Method and apparatus for document processing

ABSTRACT

Modular content framework and document format methods and systems are described. The described framework and format define a set of building blocks for composing, packaging, distributing, and rendering document-centered content. These building blocks define a platform-independent framework for document formats that enable software and hardware systems to generate, exchange, and display documents reliably and consistently. The framework and format have been designed in a flexible and extensible fashion. In addition to this general framework and format, a particular format, known as the reach package format, is defined using the general framework. The reach package format is a format for storing paginated documents. The contents of a reach package can be displayed or printed with full fidelity among devices and applications in a wide range of environments and across a wide range of scenarios.

RELATED APPLICATIONS

This application is a Continuation of co-pending application Ser. No. 10/837,043, filed Apr. 30, 2004, entitled “Method and Apparatus for Document Processing”, and incorporated herein by reference.

TECHNICAL FIELD

This invention relates to a content framework, document format and related methods and systems that can utilize both.

BACKGROUND

Typically today, there are many different types of content frameworks to represent content, and many different types of document formats to format various types of documents. Many times, each of these frameworks and formats requires its own associated software in order to build, produce, process or consume an associated document. For those who have the particular associated software installed on an appropriate device, building, producing, processing or consuming associated documents is not much of a problem. For those who do not have the appropriate software, building, producing, processing or consuming associated documents is typically not possible.

Against this backdrop, there is a continuing need for ubiquity insofar as production and consumption of documents is concerned.

SUMMARY

Modular content framework and document format methods and systems are described. The described framework and format define a set of building blocks for composing, packaging, distributing, and rendering document-centered content. These building blocks define a platform-independent framework for document formats that enable software and hardware systems to generate, exchange, and display documents reliably and consistently. The framework and format have been designed in a flexible and extensible fashion.

In addition to this general framework and format, a particular format, known as the reach package format, is defined using the general framework. The reach package format is a format for storing paginated documents. The contents of a reach package can be displayed or printed with full fidelity among devices and applications in a wide range of environments and across a wide range of scenarios.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of components of an exemplary framework and format in accordance with one embodiment.

FIG. 2 is a block diagram of an exemplary package holding a document comprising a number of parts in accordance with one embodiment.

FIG. 3 is a block diagram that illustrates an exemplary writer that produces a package, and a reader that reads the package, in accordance with one embodiment.

FIG. 4 illustrates an example part that binds together three separate pages.

FIG. 5 is a diagram that illustrates an exemplary selector and sequences arranged to produce a financial report containing both an English representation and a French representation of the report, in accordance with one embodiment.

FIG. 6 illustrates some examples of writers and readers working together to communicate about a package, in accordance with one embodiment.

FIG. 7 illustrates an example of interleaving multiple parts of a document.

FIGS. 8 and 9 illustrate different examples of packaging the multiple parts of the document shown in FIG. 7.

FIG. 10 illustrates an exemplary reach package and each of the valid types of parts that can make up or be found in a package, in accordance with one embodiment.

FIG. 11 illustrates an exemplary mapping of Common Language Runtime concepts to XML in accordance with one embodiment.

FIG. 12 illustrates both upright and sideways glyph metrics in accordance with one embodiment.

FIG. 13 illustrates a one-to-one cluster map in accordance with one embodiment.

FIG. 14 illustrates a many-to-one cluster map in accordance with one embodiment.

FIG. 15 illustrates a one-to-many cluster map in accordance with one embodiment.

FIG. 16 illustrates a many-to-many cluster map in accordance with one embodiment.

DETAILED DESCRIPTION

Overview

This document describes a modular content framework and document format. The framework and format define a set of building blocks for composing, packaging, distributing, and rendering document-centered content. These building blocks define a platform-independent framework for document formats that enable software and hardware systems to generate, exchange, and display documents reliably and consistently. The framework and format have been designed in a flexible and extensible fashion. In various embodiments, there is no restriction to the type of content that can be included, how the content is presented, or the platform on which to build clients for handling the content.

In addition to this general framework, a particular format is defined using the general framework. This format is referred to as the reach package format in this document, and is a format for storing paginated or pre-paginated documents. The contents of a reach package can be displayed or printed with full fidelity among devices and applications in a wide range of environments and across a wide range of scenarios.

One of the goals of the framework described below is to ensure the interoperability of independently-written software and hardware systems reading or writing content produced in accordance with the framework and format described below. In order to achieve this interoperability, the described format defines formal requirements that systems that read or write content must satisfy.

The discussion below is organized along the following lines and presented in two main sections: one entitled “The Framework” and one entitled “The Reach Package Format”.

The section entitled “The Framework” presents an illustrative packaging model and describes the various parts and relationships that make up framework packages. Information about using descriptive metadata in framework packages is discussed, as well as the process of mapping to physical containers, extending framework markup, and the use of framework versioning mechanisms.

The section entitled “The Reach Package Format” explores the structure of one particular type of framework-built package referred to as the reach package. This section also describes the package parts specific to a fixed payload and defines a reach package markup model and drawing model. This section concludes with exemplary reach markup elements and their properties along with illustrated samples.

As a high level overview of the discussion that follows, consider FIG. 1 which illustrates aspects of the inventive framework and format generally at 100. Certain exemplary components of the framework are illustrated at 102, and certain components of the reach package format are illustrated at 104.

Framework 102 comprises exemplary components which include, without limitation, a relationship component, a pluggable containers component, an interleaving/streaming component and a versioning/extensibility component, each of which is explored in more detail below. Reach package format 104 comprises components which include a selector/sequencer component and a package markup definition component.

In the discussion that follows below, periodic reference will be made back to FIG. 1 so that the reader can maintain perspective as to where the described components fit in the framework and package format.

The Framework

In the discussion that follows, a description of a general framework is provided. Separate primary sub-headings include “The Package Model”, “Composition Parts: Selector and Sequence”, “Descriptive Metadata”, “Physical Model”, “Physical Mappings” and “Versioning and Extensibility”. Each primary sub-heading has one or more related sub-headings.

The Package Model

This section describes the package model and includes sub-headings that describe packages and parts, drivers, relationships, package relationships and the start part.

Packages and Parts

In the illustrated and described model, content is held within a package. A package is a logical entity that holds a collection of related parts. The package's purpose is to gather up all of the pieces of a document (or other types of content) into one object that is easy for programmers and end-users to work with. For example, consider FIG. 2 which illustrates an exemplary package 200 holding a document comprising a number of parts including an XML markup part 202 representing the document, a font part 204 describing a font that is used in the document, a number of page parts 206 describing pages of the document, and a picture part representing a picture within the document. The XML markup part 202 that represents a document is advantageous in that it can permit easy searchability and referencing without requiring the entire content of a package to be parsed. This will become more apparent below.

Throughout this document the notion of readers (also referred to as consumers) and writers (also referred to as producers) is introduced and discussed. A reader, as that term is used in this document, refers to an entity that reads modular content format-based files or packages. A writer, as that term is used in this document, refers to an entity that writes modular content format-based files or packages. As an example, consider FIG. 3, which shows a writer that produces a package and a reader that reads a package. Typically, the writer and reader will be embodied as software. In at least one embodiment, much of the processing overhead and complexities associated with creating and formatting packages is placed on the writer. This, in turn, removes much of the processing complexity and overhead from readers which, as will be appreciated by the skilled artisan, is a departure from many current models. This aspect will become apparent below.

In accordance with at least one embodiment, a single package contains one or more representations of the content held in the package. Often a package will be a single file, referred to in this application as a container. This gives end-users, for example, a convenient way to distribute their documents with all of the component pieces of the document (images, fonts, data, etc.). While packages often correspond directly to a single file, this is not necessarily always so. A package is a logical entity that may be represented physically in a variety of ways (e.g., without limitation, in a single file, a collection of loose files, in a database, ephemerally in transit over a network connection, etc.). Thus containers hold packages, but not all packages are stored in containers.

An abstract model describes packages independently of any physical storage mechanism. For example, the abstract model does not refer to “files”, “streams”, or other physical terms related to the physical world in which the package is located. As discussed below, the abstract model allows users to create drivers for various physical formats, communication protocols, and the like. By analogy, when an application wants to print an image, it uses an abstraction of a printer (presented by the driver that understands the specific kind of printer). Thus, the application is not required to know about the specific printing device or how to communicate with the printing device.

A container provides many benefits over what might otherwise be a collection of loose, disconnected files. For example, similar components may be aggregated and content may be indexed and compressed. In addition, relationships between components may be identified and rights management, digital signatures, encryption and metadata may be applied to components. Of course, containers can be used for and can embody other features which are not specifically enumerated above.

Common Part Properties

In the illustrated and described embodiment, a part comprises common properties (e.g., name) and a stream of bytes. This is analogous to a file in a file system or a resource on an HTTP server. In addition to its content, each part has some common part properties. These include a name, which is the name of the part, and a content type, which is the type of content stored in the part. Parts may also have one or more associated relationships, as discussed below.

Part names are used whenever it is necessary to refer in some way to a part. In the illustrated and described embodiment, names are organized into a hierarchy, similar to paths on a file system or paths in URIs. Below are examples of part names:
/document.xml
/tickets/ticket.xml
/images/march/summer.jpeg
/pages/page4.xml

As seen above, in this embodiment, part names have the following characteristics:

-   Part names are similar to file names in a traditional file system.
-   Part names begin with a forward slash (‘/’).
-   Like paths in a file-system or paths in a URI, part names can be organized into a hierarchy by a set of directory-like names (tickets, images/march and pages in the above examples).
-   This hierarchy is composed of segments delineated by slashes.
-   The last segment of the name is similar to a filename in a traditional file-system.

It is important to note that the rules for naming parts, especially the valid characters that can be used for part names, are specific to the framework described in this document. These part name rules are based on internet-standard URI naming rules. In accordance with this embodiment, the grammar used for specifying part names exactly matches the abs_path syntax defined in Sections 3.3 (Path Component) and 5 (Relative URI References) of RFC2396 (Uniform Resource Identifiers (URI): Generic Syntax).

The following additional restrictions are applied to abs_path as a valid part name:

-   Query Component, as it is defined in Sections 3 (URI Syntactic Components) and 3.4 (Query Component), is not applicable to a part name.
-   Fragment identifier, as it is described in Section 4.1 (Fragment Identifier), is not applicable to a part name.
-   It is illegal to have any part with a name created by appending *( “/” segment ) to the part name of an existing part.

Grammar for part names is shown below:
part_name = “/” segment *( “/” segment )
segment = *pchar
pchar = unreserved | escaped | “:” | “@” | “&” | “=” | “+” | “$” | “,”
unreserved = alphanum | mark
escaped = “%” hex hex
hex = digit | “A” | “B” | “C” | “D” | “E” | “F” | “a” | “b” | “c” | “d” | “e” | “f”
mark = “-” | “_” | “.” | “!” | “~” | “*” | “'” | “(” | “)”
alpha = lowalpha | upalpha
lowalpha = “a” | “b” | “c” | “d” | “e” | “f” | “g” | “h” | “i” | “j” | “k” | “l” | “m” | “n” | “o” | “p” | “q” | “r” | “s” | “t” | “u” | “v” | “w” | “x” | “y” | “z”
upalpha = “A” | “B” | “C” | “D” | “E” | “F” | “G” | “H” | “I” | “J” | “K” | “L” | “M” | “N” | “O” | “P” | “Q” | “R” | “S” | “T” | “U” | “V” | “W” | “X” | “Y” | “Z”
digit = “0” | “1” | “2” | “3” | “4” | “5” | “6” | “7” | “8” | “9”
alphanum = alpha | digit

The segments of the names of all parts in a package can be seen to form a tree. This is analogous to what happens in file systems, in which all of the non-leaf nodes in the tree are folders and the leaf nodes are the actual files containing content. These folder-like nodes (i.e., non-leaf nodes) in the name tree serve a similar function of organizing the parts in the package. It is important to remember, however, that these “folders” exist only as a concept in the naming hierarchy; they have no other manifestation in the persistence format.

Part names cannot live at the “folder” level. Specifically, non-leaf nodes in the part naming hierarchy (“folders”) cannot contain a part and a subfolder with the same name.
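By way of illustration only, the grammar and the folder restriction described above might be checked as sketched below. The function names `is_valid_part_name` and `conflicts_with_existing` are hypothetical and are not part of the described framework; the regular expression is simply a transcription of the part name grammar given above.

```python
import re
from typing import Iterable

# pchar = unreserved | escaped | ":" | "@" | "&" | "=" | "+" | "$" | ","
_PCHAR = r"(?:[A-Za-z0-9\-_.!~*'()]|%[0-9A-Fa-f]{2}|[:@&=+$,])"
# part_name = "/" segment *( "/" segment ), where segment = *pchar
_PART_NAME = re.compile("/" + _PCHAR + "*(?:/" + _PCHAR + "*)*")


def is_valid_part_name(name: str) -> bool:
    """Return True if the whole string matches the part name grammar."""
    return _PART_NAME.fullmatch(name) is not None


def conflicts_with_existing(new_name: str, existing: Iterable[str]) -> bool:
    """Return True if new_name would make some name both a part and a "folder".

    That happens when new_name extends an existing part name with further
    segments, or when an existing part name extends new_name.
    """
    for other in existing:
        if new_name.startswith(other + "/") or other.startswith(new_name + "/"):
            return True
    return False
```

Under this sketch, a query or fragment character in a name is rejected automatically, because neither “?” nor “#” is a pchar.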

In the illustrated and described embodiment, every part has a content type which identifies what type of content is stored in a part. Examples of content types include:
image/jpeg
text/xml
text/plain; charset=“us-ascii”

Content types are used in the illustrated framework as defined in RFC2045 (Multipurpose Internet Mail Extensions (MIME)). Specifically, each content type includes a media type (e.g., text), a subtype (e.g., plain) and an optional set of parameters in key=value form (e.g., charset=“us-ascii”); multiple parameters are separated by semicolons.
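As a minimal sketch of the structure just described, a content type string might be split into its media type, subtype and parameters as follows. The helper name `parse_content_type` is hypothetical, and the sketch assumes that parameter values do not themselves contain semicolons, which a complete RFC2045 parser would have to handle.

```python
def parse_content_type(value):
    """Split 'type/subtype; key=value; ...' into (media_type, subtype, params)."""
    head, *rest = value.split(";")
    media_type, _, subtype = head.strip().partition("/")
    params = {}
    for item in rest:
        key, _, val = item.strip().partition("=")
        if key:
            # Parameter names are case-insensitive; surrounding quotes are dropped.
            params[key.lower()] = val.strip().strip('"')
    return media_type.lower(), subtype.lower(), params
```

For instance, `parse_content_type('text/plain; charset="us-ascii"')` yields the media type `text`, the subtype `plain` and a single `charset` parameter.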

Part Addressing

Often parts will contain references to other parts. As a simple example, imagine a container with two parts: a markup file and an image. The markup file will want to hold a reference to the image so that when the markup file is processed, the associated image can be identified and located. Designers of content types and XML schemas may use URIs to represent these references. To make this possible, a mapping between the world of part names and the world of URIs needs to be defined.

In order to allow the use of URIs in a package, a special URI interpretation rule must be used when evaluating URIs in package-based content: the package itself should be treated as the “authority” for URI references and the path component of the URI is used to navigate the part name hierarchy in the package.

For example, given a package URI of http://www.example.com/foo/something.package, a reference to /abc/bar.xml is interpreted to mean the part called /abc/bar.xml, not the URI http://www.example.com/abc/bar.xml.

Relative URIs should be used when it is necessary to have a reference from one part to another in a container. Using relative references allows the contents of the container to be moved together into a different container (or into the container from, for example, the file system) without modifying the cross-part references.

Relative references from a part are interpreted relative to the “base URI” of the part containing the reference. By default, the base URI of a part is the part's name.

Consider a container which includes parts with the following names:
/markup/page.xml
/images/picture.jpeg
/images/other_picture.jpeg

If the “/markup/page.xml” part contains a URI reference to “../images/picture.jpeg”, then this reference must be interpreted as referring to the part name “/images/picture.jpeg”, according to the rules above.
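The interpretation rule above can be sketched, purely as an example, using path arithmetic on part names. The function name `resolve_part_reference` is hypothetical; the sketch assumes the default base URI (the name of the referring part) and ignores percent-escaping and explicit base overrides.

```python
import posixpath


def resolve_part_reference(source_part, reference):
    """Resolve a reference found in source_part to an absolute part name.

    The package acts as the authority, so a reference beginning with "/" is
    taken as a part name directly; any other reference is resolved against
    the "folder" of the referring part.
    """
    base_dir = posixpath.dirname(source_part)
    return posixpath.normpath(posixpath.join(base_dir, reference))
```

With the part names listed above, resolving “../images/picture.jpeg” from “/markup/page.xml” gives “/images/picture.jpeg”, and a reference to “/abc/bar.xml” is returned unchanged.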

Some content types provide a way to override the default base URI by specifying a different base in the content. In the presence of one of these overrides, the explicitly specified base URI should be used instead of the default.

Sometimes it is useful to “address” a portion or specific point in a part. In the URI world, a fragment identifier is used [see, e.g., RFC2396]. In a container, the mechanism works the same way. Specifically, the fragment is a string that contains additional information that is understood in the context of the content type of the addressed part. For example, in a video file a fragment might identify a frame; in an XML file it might identify a portion of the XML file via an XPath.

A fragment identifier is used in conjunction with a URI that addresses a part to identify fragments of the addressed part. The fragment identifier is optional and is separated from the URI by a crosshatch (“#”) character. As such, it is not part of a URI, but is often used in conjunction with a URI.
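As a small illustrative sketch, a reader might separate the optional fragment identifier from the reference before resolving the part itself. The name `split_fragment` is hypothetical; how the fragment string is interpreted depends on the content type of the addressed part and is not shown.

```python
def split_fragment(reference):
    """Split a reference at the first "#" into (uri, fragment).

    The fragment is None when the reference carries no "#" at all.
    """
    uri, separator, fragment = reference.partition("#")
    return uri, (fragment if separator else None)
```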

The following discussion provides some guidance for part naming, as the package and part naming model is quite flexible. This flexibility allows for a wide range of applications of a framework package. However, it is important to recognize that the framework is designed to enable scenarios in which multiple, unrelated software systems can manipulate “their own” parts of a package without colliding with each other. To allow this, certain guidelines are provided which, if followed, make this possible.

The guidelines given here describe a mechanism for minimizing or at least reducing the occurrences of part naming conflicts, and dealing with them when they do arise. Writers creating parts in a package must take steps to detect and handle naming conflicts with existing parts in the package. In the event that a name conflict arises, writers may not blindly replace existing parts.

In situations where a package is guaranteed to be manipulated by a single writer, that writer may deviate from these guidelines. However, if there is a possibility of multiple independent writers sharing a package, all writers must follow these guidelines. It is recommended, however, that all writers follow these guidelines in any case.

-   It is required that writers adding parts into an existing container do so in a new “folder” of the naming hierarchy, rather than placing parts directly in the root, or in a pre-existing folder. In this way, the possibility of name conflicts is limited to the first segment of the part name. Parts created within this new folder can be named without risking conflicts with existing parts.
-   In the event that the “preferred” name for the folder is already used by an existing part, a writer must adopt some strategy for choosing alternate folder names. Writers should use the strategy of appending digits to the preferred name until an available folder name is found (possibly resorting to a GUID after some number of unsuccessful iterations).
-   One consequence of this policy is that readers must not attempt to locate a part via a “magic” or “well known” part name. Instead, writers must create a package relationship to at least one part in each folder they create. Readers must use these package relationships to locate the parts rather than relying on well known names.
-   Once a reader has found at least one part in a folder (via one of the aforementioned package relationships) it may use conventions about well known part names within that folder to find other parts.
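The folder-naming strategy in the guidelines above might be sketched as follows. This is an example under stated assumptions rather than a prescribed algorithm: the function name `choose_folder_name`, the `max_attempts` limit and the exact form of the GUID-based fallback are all hypothetical choices.

```python
import uuid


def choose_folder_name(preferred, existing_part_names, max_attempts=100):
    """Pick a first-segment "folder" name not used by any existing part."""
    # Only the first segment can conflict, so collect the first segments in use.
    taken = {name.split("/")[1] for name in existing_part_names if name.startswith("/")}
    if preferred not in taken:
        return preferred
    # Append digits to the preferred name until an available name is found.
    for n in range(1, max_attempts + 1):
        candidate = f"{preferred}{n}"
        if candidate not in taken:
            return candidate
    # After too many unsuccessful iterations, resort to a GUID.
    return f"{preferred}-{uuid.uuid4()}"
```

A writer following the guidelines would then create its parts under the returned folder and add a package relationship to at least one of them, so that readers need not guess the chosen name.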

Drivers

The file format described herein can be used by different applications, different document types, etc., many of which have conflicting uses, conflicting formats, and the like. One or more drivers are used to resolve various conflicts, such as differences in file formats, differences in communication protocols, and the like. For example, different file formats include loose files and compound files, and different communication protocols include http, network, and wireless protocols. A group of drivers abstract various file formats and communication protocols into a single model. Multiple drivers can be provided for different scenarios, different customer requirements, different physical configurations, etc.
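One way to picture such drivers, offered only as a sketch, is a common interface that exposes parts by name while each driver hides its own physical representation. The class and method names below (`PackageDriver`, `InMemoryDriver`, `LooseFilesDriver`, `part_names`, `read_part`) are hypothetical and are not defined by the framework.

```python
import os
from abc import ABC, abstractmethod


class PackageDriver(ABC):
    """Hypothetical abstract model: parts addressed by name, storage hidden."""

    @abstractmethod
    def part_names(self):
        """Return the sorted names of all parts in the package."""

    @abstractmethod
    def read_part(self, name):
        """Return the bytes of the named part."""


class InMemoryDriver(PackageDriver):
    """Keeps parts in a dictionary, e.g. for a package in transit."""

    def __init__(self, parts):
        self._parts = dict(parts)

    def part_names(self):
        return sorted(self._parts)

    def read_part(self, name):
        return self._parts[name]


class LooseFilesDriver(PackageDriver):
    """Maps part names onto loose files under a root directory."""

    def __init__(self, root):
        self._root = root

    def part_names(self):
        names = []
        for dirpath, _dirs, files in os.walk(self._root):
            for filename in files:
                rel = os.path.relpath(os.path.join(dirpath, filename), self._root)
                names.append("/" + rel.replace(os.sep, "/"))
        return sorted(names)

    def read_part(self, name):
        path = os.path.join(self._root, *name.strip("/").split("/"))
        with open(path, "rb") as handle:
            return handle.read()


def total_size(driver):
    """Works with any driver, without knowing how the package is stored."""
    return sum(len(driver.read_part(name)) for name in driver.part_names())
```

Code such as `total_size` is written once against the abstract model and runs unchanged over either driver, which is the point of the printer-driver analogy made earlier.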

Relationships

Parts in a package may contain references to other parts in that package. In general, however, these references are represented inside the referring part in ways that are specific to the content type of the part; that is, in arbitrary markup or an application-specific encoding. This effectively hides the internal linkages between parts from readers that don't understand the content types of the parts containing such references.

Even for common content types (such as the Fixed Payload markup described in the Reach Package section), a reader would need to parse all of the content in a part to discover and resolve the references to other parts. For example, when implementing a print system that prints documents one page at a time, it may be desirable to identify pictures and fonts contained in the particular page. Existing systems must parse all information for each page, which can be time consuming, and must understand the language of each page, which may not be the situation with certain devices or readers (e.g., ones that are performing intermediate processing on the document as it passes through a pipeline of processors on the way to a device). Instead, the systems and methods described herein use relationships to identify relationships between parts and to describe the nature of those relationships. The relationship language is simple and defined once so that readers can understand relationships without requiring knowledge of multiple different languages. In one embodiment, the relationships are represented in XML as individual parts. Each part has an associated relationship part that contains the relationships for which the part is a source.

For example, a spreadsheet application uses this format and stores different spreadsheets as parts. An application that knows nothing about the spreadsheet language can still discover various relationships associated with the spreadsheets. For example, the application can discover images in the spreadsheets and metadata associated with the spreadsheets. An example relationship schema is provided below:
<?xml version="1.0"?>
<xsd:schema xmlns:mmcfrels="http://mmcfrels-PLACEHOLDER" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:attribute name="Target" type="xsd:string"/>
  <xsd:attribute name="Name" type="xsd:string"/>
  <xsd:element name="Relationships">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Relationship" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Relationship">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:string">
          <xsd:attribute ref="Target"/>
          <xsd:attribute ref="Name"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

This schema defines two XML elements, one called “Relationships” and one called “Relationship.” The “Relationship” element is used to describe a single relationship as described herein and has the following attributes: (1) “Target,” which indicates the part to which the source part is related, and (2) “Name,” which indicates the type or nature of the relationship. The “Relationships” element is defined to allow it to hold zero or more “Relationship” elements and serves simply to collect these “Relationship” elements together in a unit.

The systems and methods described herein introduce a higher-level mechanism to solve these problems called “relationships”. Relationships provide an additional way to represent the kind of connection between a source part and a target part in a package. Relationships make the connections between parts directly “discoverable” without looking at the content in the parts, so they are independent of content-specific schema and faster to resolve. Additionally, these relationships are protocol independent. A variety of different relationships may be associated with a particular part.

Relationships provide a second important function: allowing parts to be related without modifying them. Sometimes this information serves as a form of “annotation” where the content type of the “annotated” part does not define a way to attach the given information. Potential examples include attached descriptive metadata, print tickets and true annotations. Finally, some scenarios require information to be attached to an existing part specifically without modifying that part, for example, when the part is encrypted and cannot be decrypted or when the part is digitally signed and changing it would invalidate the signature. In another example, a user may want to attach an annotation to a JPEG image file. The JPEG image format does not currently provide support for identifying annotations. Changing the JPEG format to accommodate this user's desire is not practical. However, the systems and methods discussed herein allow the user to provide an annotation to a JPEG file without modifying the JPEG image format.

In one embodiment, relationships are represented using XML in relationship parts. Each part in the container that is the source of one or more relationships has an associated relationship part. This relationship part holds (expressed in XML using the content type application/PLACEHOLDER) the list of relationships for that source part.

FIG. 4 shows an environment 400 in which a “spine” part 402 (similar to a FixedPanel) binds together three pages 406, 408 and 410. The set of pages bound together by the spine has an associated “print ticket” 404. Additionally, page 2 has its own print ticket 412. The connections from the spine part 402 to its print ticket 404 and from page 2 to its print ticket 412 are represented using relationships. In the arrangement of FIG. 4, the spine part 402 would have an associated relationship part which contained a relationship that connects the spine to ticket1, as shown in the example below.
<Relationships xmlns="http://mmcfrels-PLACEHOLDER">
  <Relationship
    Target="../tickets/ticket1.xml"
    Name="http://mmcf-printing-ticket/PLACEHOLDER"/>
</Relationships>

Relationships are represented using <Relationship> elements nested in a single <Relationships> element. These elements are defined in the http://mmcfrels (PLACEHOLDER) namespace. See the example schema above, and related discussion, for example relationships.

The relationship element has the following additional attributes:

Attribute | Required | Meaning
Target | Yes | A URI that points to the part at the other end of the relationship. Relative URIs MUST be interpreted relative to the source part.
Name | Yes | An absolute URI that uniquely defines the role or purpose of the relationship.
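To illustrate how little a reader needs to know in order to discover relationships, the following sketch reads the Target and Name attributes from a relationship part using a general-purpose XML parser. The function name `parse_relationships` and the constant names are hypothetical; the namespace and relationship name are the placeholder values used in this document, and the sample markup mirrors the spine example of FIG. 4.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace taken from the example markup in this document.
RELS_NS = "http://mmcfrels-PLACEHOLDER"

SPINE_RELS = """\
<Relationships xmlns="http://mmcfrels-PLACEHOLDER">
  <Relationship
    Target="../tickets/ticket1.xml"
    Name="http://mmcf-printing-ticket/PLACEHOLDER"/>
</Relationships>
"""


def parse_relationships(xml_text):
    """Return a list of (target, name) pairs from a relationship part."""
    root = ET.fromstring(xml_text)
    tag = "{" + RELS_NS + "}Relationship"
    return [(rel.get("Target"), rel.get("Name")) for rel in root.iter(tag)]
```

The reader never inspects the source part itself, so this works regardless of the content type of the spine.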

The Name attribute is not necessarily an actual address. Different types of relationships are identified by their Names. These names are defined in the same way that namespaces are defined for XML namespaces. Specifically, by using names patterned after the Internet domain namespace, non-coordinating parties can safely create non-conflicting relationship names, just as they can for XML namespaces.

The relationships part is not permitted to participate in other relationships. However, it is a first class part in all other senses (e.g., it is URI addressable, it can be opened, read, deleted, etc.). Relationships do not typically point at things outside the package. URIs used to identify relationship targets do not generally include a URI scheme.

A part and its associated relationship part are connected by a naming convention. In this example, the relationship part for the spine would be stored in /content/_rels/spine.xml.rels and the relationships for page 2 would be stored in /content/_rels/p2.xml.rels. Note two special naming conventions being used here. First, the relationship part for some (other) part in a given “folder” in the name hierarchy is stored in a “sub-folder” called _rels (to identify relationships). Second, the name of this relationship-holding part is formed by appending the .rels extension to the name of the original part. In particular embodiments, relationship parts are of the content type application/xml+relationshipsPLACEHOLDER.
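The two naming conventions just described can be expressed, as a minimal sketch, in a single helper. The function name `relationship_part_name` is hypothetical, and the part names /content/spine.xml and /content/p2.xml are inferred from the relationship part names given above.

```python
import posixpath


def relationship_part_name(part_name):
    """Derive the name of the relationship part for a given source part.

    The relationship part lives in a "_rels" sub-folder next to the source
    part, and its last segment is the source's last segment plus ".rels".
    """
    folder, leaf = posixpath.split(part_name)
    return posixpath.join(folder, "_rels", leaf + ".rels")
```

Because the mapping is purely a function of the source name, a reader can find the relationships of any part without searching the package.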

A relationship represents a directed connection between two parts. Because of the way that the relationship is being represented, it is efficient to traverse relationships from their source parts (since it is trivial to find the relationships part for any given part). However, it is not efficient to traverse relationships backwards from the target of the relationship (since the way to find all of the relationships to a part is to look through all of the relationships in the container).
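The cost of backwards traversal can be seen in the following sketch, which has to visit every relationship in the container in order to learn which parts point at a given target. The function name `build_reverse_index` and the shape of its input (a mapping from source part name to a list of (target, name) pairs) are assumptions made for this example only.

```python
import posixpath
from collections import defaultdict


def build_reverse_index(relationships_by_source):
    """Map each target part name to the (source, name) pairs that point at it."""
    incoming = defaultdict(list)
    # Every relationship of every source part must be examined.
    for source, relationships in relationships_by_source.items():
        base_dir = posixpath.dirname(source)
        for target, name in relationships:
            # Relative targets are interpreted relative to the source part.
            resolved = posixpath.normpath(posixpath.join(base_dir, target))
            incoming[resolved].append((source, name))
    return dict(incoming)
```

Forward traversal, by contrast, needs only the single relationship part of the source.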

In order to make backwards traversal of a relationship possible, a new relationship is used to represent the other (traversable) direction. This is a modeling technique that the designer of a type of relationship can use. Following the example above, if it were important to be able to find the spine that has ticket1 attached, a second relationship would be used connecting from the ticket to the spine, such as:
In content/_rels/p1.xml.rels:
<Relationships xmlns="http://mmcfrels-PLACEHOLDER">
  <Relationship
    Target="/content/spine.xml"
    Name="http://mmcf-printing-spine/PLACEHOLDER"/>
</Relationships>

Package Relationships

“Package Relationships” are used to find well-known parts in a package. This method avoids relying on naming conventions for finding parts in a package, and ensures that there will not be collisions between identical part names in different payloads.

Package relationships are special relationships whose target is a part, but whose source is not: the source is the package as a whole. To have a “well-known” part is really to have a “well-known” relationship name that helps you find that part. This works because there is a well-defined mechanism to allow relationships to be named by non-coordinating parties, while certain embodiments contain no such mechanism for part names; those embodiments are limited to a set of guidelines. The package relationships are found in the package relationships part, which is named using the standard naming conventions for relationship parts. Thus, it is named “/_rels/.rels”.

Relationships in this package relationships part are useful in finding well-known parts.

The Start Part

One example of a package-level, well-known part is the package “start” part. This is the part that is typically processed when a package is opened. It represents the logical root of the document content stored in the package. The start part of a package is located by following a well-known package relationship. In one example, this relationship has the following name: http://mmcf-start-part-PLACEHOLDER.

Composition Parts: Selector and Sequence

The described framework defines two mechanisms for building higher-order structures from parts: selectors and sequences.

A selector is a part which “selects” between a number of other parts. For example, a selector part might “select” between a part representing the English version of a document and a part representing the French version of a document. A sequence is a part which “sequences” a number of other parts. For example, a sequence part might combine (into a linear sequence) two parts, one of which represents a five-page document and one of which represents a ten-page document.

These two types of composition parts (sequence and selector) and the rules for assembling them comprise a composition model. Composition parts can compose other composition parts, so one could have, for example, a selector that selects between two compositions. As an example, consider FIG. 5, which shows an example of a financial report containing both an English representation and a French representation. Each of these representations is further composed of an introduction (a cover page) followed by the financials (a spreadsheet). In this example, a selector 500 selects between the English and French representation of the report. If the English representation is selected, sequence 502 sequences the English introduction part 506 with the English financial part 508. Alternately, if the French representation is selected, sequence 504 sequences the French introduction part 510 with the French financial part 512.

Composition Part XML

In the illustrated and described embodiment, composition parts are described using a small number of XML elements, all drawn from a common composition namespace. As an example, consider the following:

  Element: <selection>
    Attributes: None
    Allowed Child Elements: <item>

  Element: <sequence>
    Attributes: None
    Allowed Child Elements: <item>

  Element: <item>
    Attributes: Target - the part name of a part in the composition

As an example, here is the XML for the example of FIG. 5 above:

MainDocument.XML

  <selection>
    <item target="EnglishRollup.xml"/>
    <item target="FrenchRollup.xml"/>
  </selection>

EnglishRollup.XML

  <sequence>
    <item target="EnglishIntroduction.xml"/>
    <item target="EnglishFinancials.xml"/>
  </sequence>

FrenchRollup.XML

  <sequence>
    <item target="FrenchIntroduction.xml"/>
    <item target="FrenchFinancials.xml"/>
  </sequence>

In this XML, MainDocument.xml represents an entire part in the package and indicates, by virtue of the “selection” tag, that a selection is to be made between different items encapsulated by the “item” tag, i.e., the “EnglishRollup.xml” and the “FrenchRollup.xml”.

The EnglishRollup.xml and FrenchRollup.xml are, by virtue of the “sequence” tags, sequences that sequence together the respective items encapsulated by their respective “item” tags.

Thus, a simple XML grammar is provided for describing selectors and sequences. Each part in this composition block is built and performs one operation, either selecting or sequencing. By using a hierarchy of parts, different robust collections of selections and sequences can be built.

Composition Block

The composition block of a package comprises the set of all composition parts (selector or sequence) that are reachable from the starting part of the package. If the starting part of the package is neither a selector nor a sequence, then the composition block is considered empty. If the starting part is a composition part, then the child <item>s in those composition parts are recursively traversed to produce a directed, acyclic graph of the composition parts (stopping traversal when a non-composition part is encountered). This graph is the composition block (and it must, in accordance with this embodiment, be acyclic for the package to be valid).
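The traversal just described can be sketched as a depth-first walk with cycle detection. This is a minimal sketch under stated assumptions: the dictionary representation of parts, the kind labels, and the function name composition_block are all hypothetical, and the sample data mirrors the FIG. 5 example.

```python
COMPOSITION_KINDS = {"selection", "sequence"}

# Hypothetical model: part name -> (kind, list of <item> targets).
PARTS = {
    "MainDocument.xml": ("selection", ["EnglishRollup.xml", "FrenchRollup.xml"]),
    "EnglishRollup.xml": ("sequence", ["EnglishIntroduction.xml", "EnglishFinancials.xml"]),
    "FrenchRollup.xml": ("sequence", ["FrenchIntroduction.xml", "FrenchFinancials.xml"]),
    "EnglishIntroduction.xml": ("content", []),
    "EnglishFinancials.xml": ("content", []),
    "FrenchIntroduction.xml": ("content", []),
    "FrenchFinancials.xml": ("content", []),
}


def composition_block(start, parts):
    """Return {composition part: its item targets} reachable from start.

    Raises ValueError if the composition parts form a cycle, since the
    composition block must be acyclic for the package to be valid.
    """
    block = {}
    in_progress = set()

    def visit(name):
        kind, items = parts[name]
        if kind not in COMPOSITION_KINDS:
            return  # non-composition part: traversal stops here
        if name in in_progress:
            raise ValueError(f"cycle through composition part {name!r}")
        if name in block:
            return  # already fully explored via another path
        in_progress.add(name)
        for target in items:
            visit(target)
        in_progress.discard(name)
        block[name] = list(items)

    visit(start)
    return block


print(sorted(composition_block("MainDocument.xml", PARTS)))
```

If the starting part is not a composition part, the function returns an empty dictionary, matching the rule that the composition block is then considered empty.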

Determining Composition Semantics

Having established the relatively straightforward XML grammar above, the following discussion describes a way to represent the information such that selections can be made based on content type. That is, the XML described above provides enough information to allow readers to locate the parts that are assembled together into a composition, but does not provide enough information to help a reader know more about the nature of the composition. For example, given a selection that composes two parts, how does a reader know on what basis (e.g., language, paper size, etc.) to make the selection? The answer is that these rules are associated with the content type of the composition part. Thus, a selector part that is used for picking between representations based on language will have a different associated content type from a selector part that picks between representations based on paper sizes.

The general framework defines the general form for these content types:

  Application/XML+Selector-SOMETHING
  Application/XML+Sequence-SOMETHING

The SOMETHING in these content types is replaced by a word that indicates the nature of the selection or sequence, e.g. page size, color, language, resident software on a reader device and the like. In this framework then, one can invent all kinds of selectors and sequences and each can have very different semantics.

The described framework also defines the following well-known content types for selectors and sequences that all readers or reading devices must understand.

  Content Type: Application/XML+Selector+SupportedContentType
  Rules: Pick between the items based on their content types. Select the first item for which software is available that understands the given content type.

As an example, consider the following. Assume a package contains a document that has a page, and in the middle of the page there is an area in which a video is to appear. In this example, a video part of the page might comprise video in the form of a Quicktime video. One problem with this scenario is that Quicktime videos are not universally understood. Assume, however, that in accordance with this framework and, more particularly, the reach package format described below, there is a universally understood image format, JPEG. When producing the package that contains the document described above, the producer might, in addition to defining the video as a part of the package, define a JPEG image for the page and interpose a SupportedContentType selector so that if the user's computer has software that understands the Quicktime video, the Quicktime video is selected, otherwise the JPEG image is selected.
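The SupportedContentType rule ("select the first item for which software is available that understands the given content type") can be sketched as follows. The function name, the part names, and the representation of a reader's capabilities as a set of content type strings are assumptions made only for illustration.

```python
def select_supported(items, understood):
    """Return the first item whose content type the reader understands.

    items: ordered list of (part name, content type) pairs, in the order
           the <item> elements appear in the selector.
    understood: set of content types the reader has software for.
    Returns None when no item qualifies.
    """
    for part_name, content_type in items:
        if content_type in understood:
            return part_name
    return None


# Hypothetical items for the video example: the video first, then the
# universally understood JPEG fallback.
ITEMS = [
    ("/content/video.mov", "video/quicktime"),
    ("/content/still.jpeg", "image/jpeg"),
]

print(select_supported(ITEMS, {"video/quicktime", "image/jpeg"}))
print(select_supported(ITEMS, {"image/jpeg"}))
```

Because the first matching item wins, a producer would list the preferred representation first and the fallback last.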

Thus, as described above, the framework-level selector and sequence components allow a robust hierarchy to be built which, in this example, is defined in XML. In addition, there is a well-defined way to identify the behaviors of selectors and sequences using content types. Additionally, in accordance with one embodiment, the general framework comprises one particular content type that is predefined and which allows processing and utilization of packages based on what a consumer (e.g. a reader or reading device) does and does not understand.

Other composition part content types can be defined using similar rules, examples of which are discussed below.

Descriptive Metadata

In accordance with one embodiment, descriptive metadata parts provide writers or producers of packages with a way in which to store values of properties that enable readers of the packages to reliably discover the values. These properties are typically used to record additional information about the package as a whole, as well as individual parts within the container. For example, a descriptive metadata part in a package might hold information such as the author of the package, keywords, a summary, and the like.

In the illustrated and described embodiment, the descriptive metadata is expressed in XML, is stored in parts with well-known content types, and can be found using well-known relationship types.

Descriptive metadata holds metadata properties. Metadata properties are represented by a property name and one or many property values. Property values have simple data types, so each data type is described by a single XML qname. The fact that descriptive metadata properties have simple types does not mean that one cannot store data with complex XML types in a package. In this case, one must store the information as a full XML part. When this is done, all constraints about only using simple types are removed, but the simplicity of the “flat” descriptive metadata property model is lost.

In addition to the general purpose mechanism for defining sets of properties, there is a specific, well-defined set of document core properties, stored using this mechanism. These document core properties are commonly used to describe documents and include properties like title, keywords, author, etc.

Finally, metadata parts holding these document core properties can also hold custom-defined properties in addition to the document core properties.

Metadata Format

In accordance with one embodiment, descriptive metadata parts have a content type and are targeted by relationships according to the following rules:

Descriptive Metadata Discovery Rules

  Content type of a descriptive metadata part MUST be:
    Using custom-defined properties: application/xml-SimpleTypeProperties-PLACEHOLDER
    Using Document Core properties: application/xml-SimpleTypeProperties-PLACEHOLDER

  Content type of a source part which can have relationship targeting descriptive metadata part may be:
    Using custom-defined properties: ANY
    Using Document Core properties: ANY

  Name of the relationship targeting descriptive metadata part may be either:
    Using custom-defined properties: *custom-defined Uri-namespace*
    Using Document Core properties: http://mmcf-DocumentCore-PLACEHOLDER

  Number of descriptive metadata parts which can be attached to the source part may be:
    Using custom-defined properties: UNBOUNDED
    Using Document Core properties: 0 or 1

  Number of source parts which can have the same descriptive metadata part attached MUST be:
    Using custom-defined properties: UNBOUNDED
    Using Document Core properties: UNBOUNDED

The following XML pattern is used to represent descriptive metadata in accordance with one embodiment. Details about each component of the markup are given in the table after the sample.

  <mcs:properties xmlns:mcs="http://mmcf-core-services/PLACEHOLDER"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <mcs:property prns:name="property name"
                  xmlns:prns="property namespace"
                  mcs:type="datatype"
                  mcs:multivalued="true |false">
      <mcs:value> ... value ... </mcs:value>
    </mcs:property>
  </mcs:properties>

Markup Component: Description

  xmlns:mcs="http://mmcf-common-services/PLACEHOLDER"
    Defines the MMCF common services namespace.

  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    Defines the XML schema namespace. Many custom-defined properties and the majority of Document Core properties will have built-in data types defined using an XSD. Although each property can have its own namespace, the XSD namespace is placed on the root of the descriptive metadata XML.

  mcs:properties
    Root element of the descriptive metadata XML.

  mcs:property
    Property element. A property element holds a property qname and value. There may be an unbounded number of property elements. Property elements are considered to be immediate children of the root element.

  xmlns:prns
    Property Namespace: For Document Core properties it is http://mmcf-DocumentCore-PLACEHOLDER. For custom-defined properties it will be a custom namespace.

  prns:name
    Property Name: string attribute which holds property name.

  mcs:type="datatype"
    Type is the string attribute that holds the property datatype definition, e.g. xsd:string.

  mcs:value
    This component specifies the value of the property. Value elements are immediate children of property elements. If mcs:multivalued="true", then there may be an unbounded number of value elements.

Document Core Properties

The following is a table of document core properties that includes the name of the property, the property type and a description.

  Comments (String, optional, single-valued)
    A comment to the document as a whole that an Author includes. This may be a summary of the document.

  Copyright (String, optional, single-valued)
    Copyright string for this document.

  EditingTime (Int64, optional, single-valued)
    Time spent editing this document in seconds. Set by application logic. This value must have the appropriate type.

  IsCurrentVersion (boolean, optional, single-valued)
    Indicates if this instance is a current version of the document, or an obsolete version. This field can be derived from VersionHistory, but the derivation process may be expensive.

  Language (Keyword (= string256), optional, multi-valued)
    The language of the document (English, French, etc.). This field is set by the application logic.

  RevisionNumber (String, optional, single-valued)
    Revision of the document.

  Subtitle (String, optional, single-valued)
    A secondary or explanatory title of the document.

  TextDataProperties (TextDataProperties, optional, single-valued)
    If this document has text, this property defines a collection of the text properties of the document, such as paragraph count, line count, etc.
      CharacterCount: int64
      LineCount: int64
      PageCount: int64
      ParagraphCount: int64
      WordCount: int64

  TimeLastPrinted (datetime, optional, single-valued)
    Date and time when this document was last printed.

  Title (String, optional, single-valued)
    The document title, as understood by the application that handles the document. This is different than the name of the file that contains the package.

  TitleSortOrder (String, optional, single-valued)
    The sort order of the title (e.g. “The Beatles” will have SortOrder “Beatles”, with no leading “The”).

  ContentType (Keyword (= string256), optional, multi-valued)
    Document type as set by application logic. The type that is stored here should be a recognized “mime-type”. This property may be useful for categorizing or searching for documents of certain types.

Physical Model

The physical model defines various ways in which a package is used by writers and readers. This model is based on three components: a writer, a reader and a pipe between them. FIG. 6 shows some examples of writers and readers working together to communicate about a package.

The pipe carries data from the writer to the reader. In many scenarios, the pipe can simply comprise the API calls that the reader makes to read the package from the local file system. This is referred to as direct access.

Often, however, the reader and the writer must communicate with each other over some type of protocol. This communication happens, for example, across a process boundary or between a server and a desktop computer. This is referred to as networked access and is important because of the communications characteristics of the pipe (specifically, the speed and request latency).

In order to enable maximum performance, physical package designs must consider support in three important areas: access style, layout style and communication style.

Access Style

Streaming Consumption

Because communication between the writer and the reader using networked access is not instantaneous, it is important to allow for progressive creation and consumption of packages. In particular, it is recommended, in accordance with this embodiment, that any physical package format be designed to allow a reader to begin interpreting and processing the data it receives (e.g., parts) before all of the bits of the package have been delivered through the pipe. This capability is called streaming consumption.

Streaming Creation

When a writer begins to create a package, it does not always know what it will be putting in the package. As an example, when an application begins to build a print spool file package, it may not know how many pages will need to be put into the package. As another example, a program on a server that is dynamically generating a report may not realize how long the report will be or how many pictures the report will have until it has completely generated the report. In order to accommodate writers like this, physical packages should allow writers to dynamically add parts after other parts have already been added (for example, a writer must not be required to state up front how many parts it will be creating when it starts writing). Additionally, physical packages should allow a writer to begin writing the contents of a part without knowing the ultimate length of that part. Together, these requirements enable streaming creation.

Simultaneous Creation and Consumption

In a highly-pipelined architecture, streaming creation and streaming consumption can occur simultaneously for a specific package. When designing a physical package, supporting streaming creation and supporting streaming consumption can push a design in opposite directions. However, it is often possible to find a design that supports both. Because of the benefits in a pipelined architecture, it is recommended that physical packages support simultaneous creation and consumption.

Layout Styles

Physical packages hold a collection of parts. These parts can be laid out in one of two styles: simple ordering and interleaved. With simple ordering, the parts in the package are laid out with a defined ordering. When such a package is delivered in a pure linear fashion, starting with the first byte in the package through to the last, all of the bytes for the first part arrive first, then all of the bytes for the second part, and so on.

With interleaved layout, the bytes of the multiple parts are interleaved, allowing for improved performance in certain scenarios. Two scenarios that benefit significantly from interleaving are multi-media playback (e.g., delivering video and audio at the same time) and inline resource reference (e.g., a reference in the middle of a markup file to an image).

Interleaving is handled through a special convention for organizing the contents of interleaved parts. By breaking parts into pieces and interleaving these pieces, it is possible to achieve the desired results of interleaving while still making it possible to easily reconstruct the original larger part. To understand how interleaving works, FIG. 7 illustrates a simple example involving two parts: content.xml 702 and image.jpeg 704. The first part, content.xml, describes the contents of a page and in the middle of that page is a reference to an image (image.jpeg) that should appear on the page.

To understand why interleaving is valuable, consider how these parts would be arranged in a package using simple ordering, as shown in FIG. 8. A reader that is processing this package (and is receiving bytes sequentially) will be unable to display the picture until it has received all of the content.xml part as well as the image.jpeg. In some circumstances (e.g., small or simple packages, or fast communications links) this may not be a problem. In other circumstances (for example, if content.xml was very large or the communications link was very slow), needing to read through all of the content.xml part to get to the image will result in unacceptable performance or place unreasonable memory demands on the reader system.

In order to achieve closer to ideal performance, it would be nice to be able to split the content.xml part and insert the image.jpeg part into the middle, right after where the picture is referenced. This would allow the reader to begin processing the image earlier: as soon as it encounters the reference, the image data follows. This would produce, for example, the package layout shown in FIG. 9. Because of the performance benefits, it is often desirable that physical packages support interleaving. Depending on the kind of physical package being used, interleaving may or may not be supported. Different physical packages may handle the internal representation of interleaving differently. Regardless of how the physical package handles interleaving, it's important to remember that interleaving is an optimization that occurs at the physical level and a part that is broken into multiple pieces in the physical file is still one logical part; the pieces themselves aren't parts.

Communication Styles

Communication between writer and reader can be based on sequential delivery of parts or by random-access to parts, allowing them to be accessed out of order. Which of these communication styles is utilized depends on the capabilities of both the pipe and the physical package format. Generally, all pipes will support sequential delivery. Physical packages must support sequential delivery. To support random-access scenarios, both the pipe in use and the physical package must support random-access. Some pipes are based on protocols that can enable random access (e.g., HTTP 1.1 with byte-range support). In order to allow maximum performance when these pipes are in use, it is recommended that physical packages support random-access. In the absence of this support, readers will simply wait until the parts they need are delivered sequentially.

Physical Mappings

The logical packaging model defines a package abstraction; an actual instance of a package is based on some particular physical representation of a package. The packaging model may be mapped to physical persistence formats, as well as to various transports (e.g., network-based protocols). A physical package format can be described as a mapping from the components of the abstract packaging model to the features of a particular physical format. The packaging model does not specify which physical package formats should be used for archiving, distributing, or spooling packages. In one embodiment, only the logical structure is specified. A package may be “physically” embodied by a collection of loose files, a .ZIP file archive, a compound file, or some other format. The format chosen is supported by the targeted consuming device, or by a driver for the device.

Components being Mapped

Each physical package format defines a mapping for the following components. Some components are optional and a specific physical package format may not support these optional components.

Component: Description (Required or Optional)

  Parts
    Name: Names a part. (Required)
    Content type: Identifies the kind of content stored in the part. (Required)
    Part contents: Stores the actual content of the part. (Required)

  Access Styles
    Streaming Consumption: Allows readers to begin processing parts before the entire package has arrived. (Optional)
    Streaming Creation: Allows writers to begin writing parts to the package without knowing, in advance, all of the parts that will be written. (Optional)
    Simultaneous Creation and Consumption: Allows streaming creation and streaming consumption to happen at the same time on the same package. (Optional)

  Layout Styles
    Simple Ordering: All of the bytes for part N appear in the package before the bytes for part N + 1. (Optional)
    Interleaved: The bytes for multiple parts are interleaved. (Optional)

  Communication Styles
    Sequential Delivery: All of part N is delivered to a reader before part N + 1. (Optional)
    Random-Access: A reader can request the delivery of a part out of sequential order. (Optional)

Common Mapping Patterns

There exist many physical storage formats whose features partially match the packaging-model components. In defining mappings from the packaging model to such storage formats, it may be desirable to take advantage of any similarities in capabilities between the packaging model and the physical storage medium, while using layers of mapping to provide additional capabilities not inherently present in the physical storage medium. For example, some physical package formats may store individual parts as individual files in a file system. In such a physical format, it would be natural to map many part names directly to identical physical file names. Part names using characters which are not valid in file system file names may require some kind of escaping mechanism.

In many cases, a single common mapping problem may be faced by the designers of different physical package formats. Two examples of common mapping problems arise when associating arbitrary Content Types with parts, and when supporting the Interleaved layout style. This specification suggests common solutions to such common mapping problems. Designers of specific physical package formats may be encouraged, but are not required, to use the common mapping solutions defined here.

Identifying Content Types of Parts

Physical package format mappings define a mechanism for storing a content type for each part. Some physical package formats have a native mechanism for representing content types (for example, the “Content-Type” header in MIME). For such physical packages, it is recommended that the mapping use the native mechanism to represent content types for parts. For other physical package formats, some other mechanism is used to represent content types. The recommended mechanism for representing content types in these packages is by including a specially-named XML stream in the package, known as the types stream. This stream is not a part, and is therefore not itself URI-addressable. However, it can be interleaved in the physical package using the same mechanisms used for interleaving parts.

The types stream contains XML with a top level “Types” element, and one or more “Default” and “Override” sub-elements. The “Default” elements define default mappings from part name extensions to content types. This takes advantage of the fact that file extensions often correspond to content type. “Override” elements are used to specify content types on parts that are not covered by, or are not consistent with, the default mappings. Package writers may use “Default” elements to reduce the number of per-part “Override” elements, but are not required to do so.

The “Default” element has the following attributes:

  Extension (Required: Yes)
    A part name extension. A “Default” element matches any part whose name ends with a period followed by this attribute's value.

  ContentType (Required: Yes)
    A content type as defined in RFC2045. Indicates the content type of any matching parts (unless overridden by an “Override” element; see below).

The “Override” element has the following attributes:

  PartName (Required: Yes)
    A part name URI. An “Override” element matches the part whose name equals this attribute's value.

  ContentType (Required: Yes)
    A content type as defined in RFC2045. Indicates the content type of the matching part.

The following is an example of the XML contained in a types stream:

  <Types xmlns="http://mmcfcontent-PLACEHOLDER">
    <Default Extension="txt" ContentType="plain/text" />
    <Default Extension="jpeg" ContentType="image/jpeg" />
    <Default Extension="picture" ContentType="image/gif" />
    <Override PartName="/a/b/sample4.picture" ContentType="image/jpeg" />
  </Types>

The following table shows a sample list of parts, and their corresponding content types as defined by the above types stream:

  Part Name               Content Type
  /a/b/sample1.txt        plain/text
  /a/b/sample2.jpeg       image/jpeg
  /a/b/sample3.picture    image/gif
  /a/b/sample4.picture    image/jpeg

For every part in the package, the types stream contains either (a) one matching “Default” element, (b) one matching “Override” element, or (c) both a matching “Default” element and a matching “Override” element (in which case the “Override” element takes precedence). In general there is, at most, one “Default” element for any given extension, and one “Override” element for any given part name.
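The lookup rules for the types stream can be sketched with the Python standard library as follows. This is an illustrative sketch, not a definitive implementation: the function names are hypothetical, comparison is assumed to be exact (the text does not discuss case sensitivity), and where several “Default” extensions could match a name the sketch simply returns the first match.

```python
import xml.etree.ElementTree as ET

# The sample types stream from the text, reproduced as-is.
TYPES_XML = """\
<Types xmlns="http://mmcfcontent-PLACEHOLDER">
  <Default Extension="txt" ContentType="plain/text" />
  <Default Extension="jpeg" ContentType="image/jpeg" />
  <Default Extension="picture" ContentType="image/gif" />
  <Override PartName="/a/b/sample4.picture" ContentType="image/jpeg" />
</Types>
"""


def load_types(xml_text):
    """Parse a types stream into (defaults, overrides) dictionaries."""
    defaults, overrides = {}, {}
    for element in ET.fromstring(xml_text):
        local_name = element.tag.rsplit("}", 1)[-1]  # drop the namespace
        if local_name == "Default":
            defaults[element.get("Extension")] = element.get("ContentType")
        elif local_name == "Override":
            overrides[element.get("PartName")] = element.get("ContentType")
    return defaults, overrides


def content_type_of(part_name, defaults, overrides):
    """Resolve a part's content type; an Override takes precedence."""
    if part_name in overrides:
        return overrides[part_name]
    for extension, content_type in defaults.items():
        if part_name.endswith("." + extension):
            return content_type
    return None  # no matching element: the types stream is incomplete


defaults, overrides = load_types(TYPES_XML)
for name in ("/a/b/sample1.txt", "/a/b/sample3.picture", "/a/b/sample4.picture"):
    print(name, content_type_of(name, defaults, overrides))
```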

The order of “Default” and “Override” elements in the types stream is not significant. However, in interleaved packages, “Default” and “Override” elements appear in the physical package before the part(s) they correspond to.

Interleaving

Not all physical packages support interleaving of the data streams of parts natively. In one embodiment, a mapping to any such physical package uses the general mechanism described in this section to allow interleaving of parts. The general mechanism works by breaking the data stream of a part into multiple pieces that can then be interleaved with pieces of other parts, or whole parts. The individual pieces of a part exist in the physical mapping and are not addressable in the logical packaging model. Pieces may have a zero size.

The following unique mapping from a part name to the names for the individual pieces of a part is defined, such that a reader can stitch together the pieces in their original order to form the data stream of the part.

Grammar for deriving piece names for a given part name:

  piece_name = part_name "/" "[" 1*digit "]" [ ".last" ] ".piece"

The following validity constraints exist for piece_names generated by the grammar:

-   The piece numbers start with 0 and are consecutive, non-negative integer numbers. Piece numbers can be left-zero-padded.
-   The last piece of the set of pieces of a part contains the “.last” in the piece name before “.piece”.
-   The piece name is generated from the name of the logical part before mapping to names in the physical package.

Although it is not necessary to store pieces in their natural order, such storage may provide optimal efficiency. A physical package containing interleaved (pieced) parts can also contain non-interleaved (one-piece) parts, so the following example would be valid:

  spine.xaml/[0].piece
  pages/page0.xaml
  spine.xaml/[1].piece
  pages/page1.xaml
  spine.xaml/[2].last.piece
  pages/page2.xaml
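The piece-name grammar and the stitching step can be sketched as follows. The function names, the regular expression, and the representation of a physical package as a list of (name, bytes) pairs are assumptions made for illustration only.

```python
import re

# piece_name = part_name "/" "[" 1*digit "]" [ ".last" ] ".piece"
PIECE_RE = re.compile(r"^(?P<part>.+)/\[(?P<num>\d+)\](?P<last>\.last)?\.piece$")


def piece_names(part_name, count):
    """Generate the piece names for a part split into count pieces."""
    if count < 1:
        raise ValueError("a part needs at least one piece")
    return [
        f"{part_name}/[{i}]{'.last' if i == count - 1 else ''}.piece"
        for i in range(count)
    ]


def stitch(streams):
    """Rebuild logical parts from (physical name, bytes) pairs in any order."""
    whole, pieces = {}, {}
    for name, data in streams:
        match = PIECE_RE.match(name)
        if match is None:
            whole[name] = data  # a non-interleaved (one-piece) part
        else:
            numbered = pieces.setdefault(match["part"], {})
            # int() also accepts left-zero-padded piece numbers.
            numbered[int(match["num"])] = (data, match["last"] is not None)
    for part_name, numbered in pieces.items():
        count = len(numbered)
        if sorted(numbered) != list(range(count)):
            raise ValueError(f"piece numbers of {part_name!r} are not consecutive from 0")
        flags = [numbered[i][1] for i in range(count)]
        if flags != [False] * (count - 1) + [True]:
            raise ValueError(f"only the final piece of {part_name!r} may be marked .last")
        whole[part_name] = b"".join(numbered[i][0] for i in range(count))
    return whole


print(piece_names("spine.xaml", 3))
```

Applied to the example layout above, stitch would return four logical parts: spine.xaml (the three pieces concatenated in numeric order) and the three page parts unchanged.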

Specific Mappings

The following defines specific mappings for the following physical formats: Loose files in a Windows file system.

Mapping to Loose Files in a Windows File System

In order to better understand how to map elements of the logical model to a physical format, consider the basic case of representing a Metro package as a collection of loose files in a Windows file system. Each part in the logical package will be contained in a separate file (stream). Each part name in the logical model corresponds to the name of the file.

Logical Component: Physical Representation

  Part: File(s)
  Part name: File name with path (which should look like URI, changes slash to backslash, etc.).
  Part Content Type: File containing XML expressing simple list of file names and their associated types

The part names are translated into valid Windows file names, as illustrated by the table below.

Given below are two character sets that are valid for logical part name segments (URI segments) and for Windows filenames. This table reveals two important things:

-   There are two valid URI symbols, colon (:) and asterisk (*), which need to be escaped when converting a URI to a filename.
-   There are valid filename symbols ^ { } [ ] # which cannot be present in a URI (they can be used for special mapping purposes, like interleaving).

“Escaping” is used as a technique to produce valid filename characters when a part name contains a character that cannot be used in a file name. To escape a character, the caret symbol (^) is used, followed by the hexadecimal representation of the character.

To map from an abs_path (part name) to a file name:

-   remove the first /
-   convert all / to \
-   escape colon and asterisk characters

For example, the part name /a:b/c/d*.xaml becomes the file name a^3ab\c\d^2a.xaml (the colon is hexadecimal 3A and the asterisk is hexadecimal 2A).

To perform the reverse mapping:

-   convert all \ to /
-   add / to the beginning of the string
-   unescape characters by replacing ^[hexCode] with the corresponding character

From URI grammar rules (RFC2396):

    path_segments = segment *( "/" segment )
    segment       = *pchar *( ";" param )
    param         = *pchar
    pchar         = unreserved | escaped | ":" | "@" | "&" | "=" | "+" | "$" | ","
    unreserved    = alphanum | mark
    alphanum      = alpha | digit
    mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
    escaped       = "%" hex hex
    hex           = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f"

Characters that are valid for naming files, folders, or shortcuts:

    Alphanum
    ^   Accent circumflex (caret)
    &   Ampersand
    '   Apostrophe (single quotation mark)
    @   At sign
    {   Brace left
    }   Brace right
    [   Bracket opening
    ]   Bracket closing
    ,   Comma
    $   Dollar sign
    =   Equal sign
    !   Exclamation point
    -   Hyphen
    #   Number sign
    (   Parenthesis opening
    )   Parenthesis closing
    %   Percent
    .   Period
    +   Plus
    ~   Tilde
    _   Underscore
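The forward and reverse mappings described above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, lowercase two-digit hexadecimal codes are assumed, and only the colon (hexadecimal 3A) and asterisk (hexadecimal 2A) are escaped, as the text describes.

```python
import re

# Characters that are valid in a URI segment but not in a Windows file name.
ESCAPED_CHARS = ":*"

def part_name_to_filename(part_name):
    """Map an abs_path part name to a relative Windows file name."""
    if not part_name.startswith("/"):
        raise ValueError("part name must start with '/'")
    relative = part_name[1:]                       # remove the first /
    escaped = "".join(
        f"^{ord(ch):02x}" if ch in ESCAPED_CHARS else ch
        for ch in relative
    )                                              # escape colon and asterisk
    return escaped.replace("/", "\\")              # convert all / to \

def filename_to_part_name(filename):
    """Reverse mapping: Windows file name back to an abs_path part name."""
    with_slashes = filename.replace("\\", "/")     # convert all \ to /
    unescaped = re.sub(
        r"\^([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        with_slashes,
    )                                              # replace ^[hexCode]
    return "/" + unescaped                         # add / to the beginning
```

Because the caret cannot appear in a URI segment, it never occurs in a part name, so using it as the escape character keeps the mapping reversible.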

Versioning and Extensibility

Like other technical specifications, the specification contained herein may evolve with future enhancements. The design of the first edition of this specification includes plans for the future interchange of documents between software systems written based on the first edition, and software systems written for future editions. Similarly, this specification allows for third-parties to create extensions to the specification. Such an extension might, for example, allow for the construction of a document which exploits a feature of some specific printer, while still retaining compatibility with other readers that are unaware of that printer's existence.

Documents using new versions of the Fixed Payload markup, or third-party extensions to the markup, require readers to make appropriate decisions about behavior (e.g., how to render something visually). To guide readers, the author of a document (or the tool that generated the document) should identify appropriate behavior for readers encountering otherwise-unrecognized elements or attributes. For Reach documents, this type of guidance is important.

New printers, browsers, and other clients may implement a variety of support for future features. Document authors exploiting new versions or extensions must carefully consider the behavior of readers unaware of those versions or extensions.

Versioning Namespace

XML markup recognition is based on namespace URIs. For any XML-namespace, a reader is expected to recognize either all or none of the XML-elements and XML-attributes defined in that namespace. If the reader does not recognize the new namespace, the reader will need to perform fallback rendering operations as specified within the document.

The XML namespace URI ‘http://PLACEHOLDER/version-control’ includes the XML elements and attributes used to construct Fixed Payload markup that is version-adaptive and extensions-adaptive. Fixed Payloads are not required to have versioning elements within them. In order to build adaptive content, however, one must use at least one of the <ver:Compatibility.Rules> and <ver:AlternateContent> XML-elements.

This Fixed-Payload markup specification has an xmlns URI associated with it: ‘http://PLACEHOLDER/pdl’. Using this namespace in a Fixed Payload will indicate to a reader application that only elements defined in this specification will be used. Future versions of this specification will have their own namespaces. Reader applications familiar with the new namespace will know how to support the superset of elements and attributes defined in previous versions. Reader applications that are not familiar with the new version will consider the URI of the new version as if it were the URI of some unknown extension to the PDL. These applications may not know that a relationship exists between the namespaces, that one is a superset of the other.

Backward and “Forward” Compatibility

In the context of applications or devices supporting the systems and methods discussed herein, compatibility is indicated by the ability of clients to parse and display documents that were authored using previous versions of the specification, or unknown extensions or versions of the specification. Various versioning mechanisms address “backward compatibility,” allowing future implementations of clients to be able to support documents based on down-level versions of the specification, as illustrated below.

When an implemented client, such as a printer, receives a document built using a future version of the markup language, the client will be able to parse and understand the available rendering options. The ability of client software written according to an older version of a specification to handle some documents using features of a newer version is often called “forward compatibility.” A document written to enable forward compatibility is described as “version-adaptive.”

Further, because implemented clients will also need to be able to support documents that have unknown extensions representing new elements or properties, various semantics support the more general case of documents that are “extension adaptive.”

If a printer or viewer encounters extensions that are unknown, it will look for information embedded alongside the use of the extension for guidance about adaptively rendering the surrounding content. This adaptation involves replacing unknown elements or attributes with content that is understood. However, adaptation can take other forms, including purely ignoring unknown content. In the absence of explicit guidance, a reader should treat the presence of an unrecognized extension in the markup as an error-condition. If guidance is not provided, the extension is presumed to be fundamental to understanding the content. The rendering failure will be captured and reported to the user.

To support this model, new and extended versions of the markup language should logically group related extensions in namespaces. In this way, document authors will be able to take advantage of extended features using a minimum number of namespaces.

Versioning Markup

The XML vocabulary for supporting extension-adaptive behavior includes the following elements:

Versioning Element and Hierarchy    Description
<Compatibility.Rules>               Controls how the parser reacts to an unknown element or attribute.
  <Ignorable>                       Declares that the associated namespace URI is ignorable.
    <ProcessContent>                Declares that if an element is ignored, the contents of the element will be processed as if it was contained by the container of the ignored element.
    <CarryAlong>                    Indicates to the document editing tools whether ignorable content should be preserved when the document is modified.
  <MustUnderstand>                  Reverses the effect of an element declared ignorable.
<AlternateContent>                  In markup that exploits versioning/extension features, the <AlternateContent> element associates substitute “fallback” markup to be used by reader applications that are not able to handle the markup specified as Preferred.
  <Prefer>                          Specifies preferred content. This content is used by a client that is aware of version/extension features.
  <Fallback>                        For down-level clients, specifies the ‘down-level’ content to be substituted for the preferred content.

The <Compatibility.Rules> Element

Compatibility.Rules can be attached to any element that can hold an attached attribute, as well as to the Xaml root element. The <Compatibility.Rules> element controls how the parser reacts to unknown elements or attributes. Normally such items are reported as errors. Adding an Ignorable element to a Compatibility.Rules property informs the compiler that items from certain namespaces can be ignored.

Compatibility.Rules can contain the elements Ignorable and MustUnderstand. By default, all elements and attributes are assumed to be MustUnderstand. Elements and attributes can be made Ignorable by adding an Ignorable element into its container's Compatibility.Rules property. An element or property can be made MustUnderstand again by adding a MustUnderstand element to one of the nested containers. One Ignorable or MustUnderstand refers to a particular namespace URI within the same Compatibility.Rules element.

The <Compatibility.Rules> element affects the contents of a container, not the container's own tag or attributes. To affect a container's tag or attributes, its container must contain the compatibility rules. The Xaml root element can be used to specify compatibility rules for elements that would otherwise be root elements, such as Canvas. The Compatibility.Rules compound attribute is the first element in a container.

The <Ignorable> Element

The <Ignorable> element declares that the enclosed namespace URI is ignorable. An item can be considered ignorable if an <Ignorable> tag is declared ahead of the item in the current block or a container block, and the namespace URI is unknown to the parser. If the URI is known, the Ignorable tag is disregarded and all items are understood. In one embodiment, all items not explicitly declared as Ignorable must be understood. The Ignorable element can contain <ProcessContent> and <CarryAlong> elements, which are used to modify how an element is ignored as well as give guidance to document editing tools how such content should be preserved in edited documents.

The <ProcessContent> Element

The <ProcessContent> element declares that if an element is ignored, the contents of the element will be processed as if it was contained by the container of the ignored element.

<ProcessContent> Attributes

Attribute    Description
Elements     A space delimited list of element names for which to process the contents, or “*” indicating the contents of all elements should be processed. The Elements attribute defaults to “*” if it is not specified.

The <CarryAlong> Element

The optional <CarryAlong> element indicates to the document editing tools whether ignorable content should be preserved when the document is modified. The method by which an editing tool preserves or discards the ignorable content is in the domain of the editing tool. If multiple <CarryAlong> elements refer to the same element or attribute in a namespace, the last <CarryAlong> specified has precedence.

<CarryAlong> Attributes

Attribute    Description
Elements     A space delimited list of element names that are requested to be carried along when the document is edited, or “*” indicating the contents of all elements in the namespace should be carried along. The Elements attribute defaults to “*” if it is not specified.
Attributes   A space delimited list of attribute names within the elements that are to be carried along, or a “*” indicating that all attributes of the elements should be carried along. When an element is ignored and carried along, all attributes are carried along regardless of the contents of this attribute. This attribute only has an effect if the attribute specified is used in an element that is not ignored, as in the example below. By default, Attributes is “*”.

The <MustUnderstand> Element

<MustUnderstand> is an element that reverses the effects of an Ignorable element. This technique is useful, for example, when combined with alternate content. Outside the scope defined by the <MustUnderstand> element, the element remains Ignorable.

<MustUnderstand> Attributes

Attribute      Description
NamespaceUri   The URI of the namespace whose items must be understood.
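The scoping behavior of Ignorable and MustUnderstand described above can be sketched in Python as follows. This is only an illustrative model: the function name, the representation of rules as (kind, URI) tuples, and the ordering of scopes from outermost to innermost container are all assumptions of the sketch, not part of the markup.

```python
def is_ignorable(namespace, rule_scopes, known_namespaces):
    """Decide whether items from `namespace` may be ignored.

    rule_scopes: one list of (kind, namespace_uri) tuples per enclosing
    container, ordered from the outermost container to the innermost,
    where kind is "Ignorable" or "MustUnderstand".
    """
    if namespace in known_namespaces:
        # A known namespace is understood; any Ignorable tag is disregarded.
        return False
    ignorable = False  # by default everything is MustUnderstand
    for scope in rule_scopes:
        for kind, uri in scope:
            if uri == namespace:
                # An inner declaration overrides an outer one.
                ignorable = (kind == "Ignorable")
    return ignorable
```

Under this model, a namespace declared Ignorable on the root element and MustUnderstand on a nested container is ignorable everywhere except inside that nested container.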

The <AlternateContent> Element

The <AlternateContent> element allows alternate content to be provided if any part of the specified content is not understood. An AlternateContent block uses both a <Prefer> and a <Fallback> block. If anything in the <Prefer> block is not understood, then the contents of the <Fallback> block are used. A namespace is declared <MustUnderstand> in order to indicate that the fallback is to be used. If a namespace is declared ignorable and that namespace is used within a <Prefer> block, the content in the <Fallback> block will not be used.

Versioning Markup Examples

Using <Ignorable>

This example uses a fictitious markup namespace, http://PLACEHOLDER/Circle, that defines an element Circle in its initial version and uses the Opacity attribute of Circle introduced in a future version of the markup (version 2) and the Luminance property introduced in an even later version of the markup (version 3). This markup remains loadable in versions 1 and 2, as well as 3 and beyond. Additionally, the <CarryAlong> element specifies that v3:Luminance MUST be preserved when editing even when the editor doesn't understand v3:Luminance. For a version 1 reader, Opacity and Luminance are ignored. For a version 2 reader, only Luminance is ignored. For a version 3 reader and beyond, all the attributes are used.

    <FixedPanel
      xmlns="http://PLACEHOLDER/fixed-content"
      xmlns:v="http://PLACEHOLDER/versioned-content"
      xmlns:v1="http://PLACEHOLDER/Circle/v1"
      xmlns:v2="http://PLACEHOLDER/Circle/v2"
      xmlns:v3="http://PLACEHOLDER/Circle/v3" >
      <v:Compatibility.Rules>
        <v:Ignorable NamespaceUri="http://PLACEHOLDER/Circle/v2" />
        <v:Ignorable NamespaceUri="http://PLACEHOLDER/Circle/v3" >
          <v:CarryAlong Attributes="Luminance" />
        </v:Ignorable>
      </v:Compatibility.Rules>
      <Canvas>
        <Circle Center="0,0" Radius="20" Color="Blue"
          v2:Opacity="0.5" v3:Luminance="13" />
        <Circle Center="25,0" Radius="20" Color="Black"
          v2:Opacity="0.5" v3:Luminance="13" />
        <Circle Center="50,0" Radius="20" Color="Red"
          v2:Opacity="0.5" v3:Luminance="13" />
        <Circle Center="13,20" Radius="20" Color="Yellow"
          v2:Opacity="0.5" v3:Luminance="13" />
        <Circle Center="38,20" Radius="20" Color="Green"
          v2:Opacity="0.5" v3:Luminance="13" />
      </Canvas>
    </FixedPanel>

Using <MustUnderstand>

The following example demonstrates the use of the <MustUnderstand> element.

    <FixedPanel
      xmlns="http://PLACEHOLDER/fixed-content"
      xmlns:v="http://PLACEHOLDER/versioned-content"
      xmlns:v1="http://PLACEHOLDER/Circle/v1"
      xmlns:v2="http://PLACEHOLDER/Circle/v2"
      xmlns:v3="http://PLACEHOLDER/Circle/v3" >
      <v:Compatibility.Rules>
        <v:Ignorable NamespaceUri="http://PLACEHOLDER/Circle/v2" />
        <v:Ignorable NamespaceUri="http://PLACEHOLDER/Circle/v3" >
          <v:CarryAlong Attributes="Luminance" />
        </v:Ignorable>
      </v:Compatibility.Rules>
      <Canvas>
        <v:Compatibility.Rules>
          <v:MustUnderstand NamespaceUri="http://PLACEHOLDER/Circle/v3" />
        </v:Compatibility.Rules>
        <Circle Center="0,0" Radius="20" Color="Blue"
          v2:Opacity="0.5" v3:Luminance="13" />
        <Circle Center="25,0" Radius="20" Color="Black"
          v2:Opacity="0.5" v3:Luminance="13" />
        <Circle Center="50,0" Radius="20" Color="Red"
          v2:Opacity="0.5" v3:Luminance="13" />
        <Circle Center="13,20" Radius="20" Color="Yellow"
          v2:Opacity="0.5" v3:Luminance="13" />
        <Circle Center="38,20" Radius="20" Color="Green"
          v2:Opacity="0.5" v3:Luminance="13" />
      </Canvas>
    </FixedPanel>

Use of the <MustUnderstand> element causes the references to v3:Luminance to be in error, even though it was declared Ignorable in the root element. This technique is useful if combined with alternate content that uses, for example, the Luminance property of Canvas added in Version 2 instead (see below). Outside the scope of the Canvas element, Circle's Luminance property is ignorable again.

    <FixedPanel
      xmlns="http://PLACEHOLDER/fixed-content"
      xmlns:v="http://PLACEHOLDER/versioned-content"
      xmlns:v1="http://PLACEHOLDER/Circle/v1"
      xmlns:v2="http://PLACEHOLDER/Circle/v2"
      xmlns:v3="http://PLACEHOLDER/Circle/v3" >
      <v:Compatibility.Rules>
        <v:Ignorable NamespaceUri="http://PLACEHOLDER/Circle/v2" />
        <v:Ignorable NamespaceUri="http://PLACEHOLDER/Circle/v3" >
          <v:CarryAlong Attributes="Luminance" />
        </v:Ignorable>
      </v:Compatibility.Rules>
      <Canvas>
        <v:Compatibility.Rules>
          <v:MustUnderstand NamespaceUri="http://PLACEHOLDER/Circle/v3" />
        </v:Compatibility.Rules>
        <v:AlternateContent>
          <v:Prefer>
            <Circle Center="0,0" Radius="20" Color="Blue"
              v2:Opacity="0.5" v3:Luminance="13" />
            <Circle Center="25,0" Radius="20" Color="Black"
              v2:Opacity="0.5" v3:Luminance="13" />
            <Circle Center="50,0" Radius="20" Color="Red"
              v2:Opacity="0.5" v3:Luminance="13" />
            <Circle Center="13,20" Radius="20" Color="Yellow"
              v2:Opacity="0.5" v3:Luminance="13" />
            <Circle Center="38,20" Radius="20" Color="Green"
              v2:Opacity="0.5" v3:Luminance="13" />
          </v:Prefer>
          <v:Fallback>
            <Canvas Luminance="13">
              <Circle Center="0,0" Radius="20" Color="Blue"
                v2:Opacity="0.5" />
              <Circle Center="25,0" Radius="20" Color="Black"
                v2:Opacity="0.5" />
              <Circle Center="50,0" Radius="20" Color="Red"
                v2:Opacity="0.5" />
              <Circle Center="13,20" Radius="20" Color="Yellow"
                v2:Opacity="0.5" />
              <Circle Center="38,20" Radius="20" Color="Green"
                v2:Opacity="0.5" />
            </Canvas>
          </v:Fallback>
        </v:AlternateContent>
      </Canvas>
    </FixedPanel>

Using <AlternateContent>

If any element or attribute is declared as <MustUnderstand> but is not understood in the <Prefer> block of an <AlternateContent> block, the <Prefer> block is skipped in its entirety and the <Fallback> block is processed as normal (that is, any MustUnderstand items encountered are reported as errors).

    <v:AlternateContent>
      <v:Prefer>
        <Path xmlns:m="http://schemas.example.com/2008/metallic-finishes"
          m:Finish="GoldLeaf" ..... />
      </v:Prefer>
      <v:Fallback>
        <Path Fill="Gold" ..... />
      </v:Fallback>
    </v:AlternateContent>
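A reader's choice between the <Prefer> and <Fallback> blocks might be sketched in Python as follows, using the standard xml.etree.ElementTree module. The function names and the `understood` and `ignorable` sets are assumptions of this sketch. Treating a namespace as usable when it is either understood or declared ignorable is one reading of the rules above, not a definitive implementation.

```python
import xml.etree.ElementTree as ET

def namespace_of(qualified_name):
    """Return the namespace URI of an ElementTree '{uri}local' name, or ''."""
    if qualified_name.startswith("{"):
        return qualified_name[1:].split("}", 1)[0]
    return ""

def choose_content(alternate, understood, ignorable=frozenset()):
    """Return the child elements a reader would process.

    alternate:  the parsed AlternateContent element
    understood: set of namespace URIs the reader recognizes
    ignorable:  set of namespace URIs declared Ignorable in scope
    """
    # The versioning namespace is taken from the AlternateContent tag itself.
    versioning_ns = namespace_of(alternate.tag)
    prefer = alternate.find(f"{{{versioning_ns}}}Prefer")
    fallback = alternate.find(f"{{{versioning_ns}}}Fallback")
    for element in prefer.iter():
        if element is prefer:
            continue
        # Check the element name and every namespace-qualified attribute.
        for name in (element.tag, *element.attrib):
            ns = namespace_of(name)
            if ns and ns not in understood and ns not in ignorable:
                # Something must be understood but is not: skip Prefer entirely.
                return list(fallback)
    return list(prefer)
```

For the metallic-finishes example above, a reader that does not understand the metallic-finishes namespace would receive the Path element with Fill="Gold", while a reader that does understand it would receive the Path element carrying m:Finish.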

The Reach Package Format

In the discussion that follows, a description of a specific file format is provided. Separate primary sub-headings in this section include “Introduction to the Reach Package Format”, “The Reach Package Structure”, “Fixed Payload Parts”, “FixedPage Markup Basics”, “Fixed-Payload Elements and Properties” and “FixedPage Markup”. Each primary sub-heading has one or more related sub-headings.

Introduction to the Reach Package Format

Having described an exemplary framework above, the description that follows is one of a specific format that is provided utilizing the tools described above. It is to be appreciated and understood that the following description constitutes but one exemplary format and is not intended to limit application of the claimed subject matter.

In accordance with this embodiment, a single package may contain multiple payloads, each acting as a different representation of a document. A payload is a collection of parts, including an identifiable “root” part and all the parts required for valid processing of that root part. For instance, a payload could be a fixed representation of a document, a reflowable representation, or any arbitrary representation.

The description that follows defines a particular representation called the fixed payload. A fixed payload has a root part that contains a FixedPanel markup which, in turn, references FixedPage parts. Together, these describe a precise rendering of a multi-page document.

A package which holds at least one fixed payload, and follows other rules described below, is referred to as a reach package. Readers and writers of reach packages can implement their own parsers and rendering engines, based on the specification of the reach package format.

Features of Reach Packages

In accordance with the described embodiment, reach packages address the requirements that information workers have for distributing, archiving, and rendering documents. Using known rendering rules, reach packages can be unambiguously and exactly reproduced or printed from the format in which they are saved, without tying client devices or applications to specific operating systems or service libraries. Additionally, because the reach payload is expressed in a neutral, application-independent way, the document can typically be viewed and printed without the application used to create the package. To provide this ability, the notion of a fixed payload is introduced and contained in a reach package.

In accordance with the described embodiment, a fixed payload has a fixed number of pages and page breaks are always the same. The layout of all the elements on a page in a fixed payload is predetermined. Each page has a fixed size and orientation. As such, no layout calculations have to be performed on the consuming side and content can simply be rendered. This applies not just to graphics, but to text as well, which is represented in the fixed payload with precise typographic placement. The content of a page (text, graphics, images) is described using a powerful but simple set of visual primitives.

Reach packages support a variety of mechanisms for organizing pages. A group of pages is “glued” together one after another into a “FixedPanel.” This group of pages is roughly equivalent to a traditional multi-page document. A FixedPanel can then further participate in composition, the process of building sequences and selections to assemble a “compound” document.

In the illustrated and described embodiment, reach packages support a specific kind of sequence called a FixedPanel sequence that can be used, for example, to glue together a set of FixedPanels into a single, larger “document.” Imagine, for example, gluing together two documents that came from different sources: a two-page cover memo (a FixedPanel) and a twenty-page report (a FixedPanel).

Reach packages support a number of specific selectors that can be used when building document packages containing alternate representations of the “same” content. In particular, reach packages allow selection based on language, color capability, and page size. Thus, one could have, for example, a bi-lingual document that uses a selector to pick between the English representation and the French representation of the document.

In addition to these simple uses of selector and sequence for composition in a reach package, it is important to note that selectors and sequences can also refer to further selectors and sequences thus allowing for powerful aggregate hierarchies to be built. The exact rules for what can and cannot be done, in accordance with this embodiment, are specified below in the section entitled “The Reach Package Structure”.

Additionally, a reach package can contain additional payloads that are not fixed payloads, but instead are richer and perhaps editable representations of the document. This allows a package to contain a rich, editable document that works well in an editor application as well as a representation that is visually accurate and can be viewed without the editing application.

Finally, in accordance with this embodiment, reach packages support what is known as a print ticket. The print ticket provides settings that should be used when the package is printed. These print tickets can be attached in a variety of ways to achieve substantial flexibility. For example, a print ticket can be “attached” to an entire package and its settings will affect the whole package. Print tickets can be further attached at lower levels in the structure (e.g., to individual pages) and these print tickets will provide override settings to be used when printing the part to which they are attached.
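The override behavior of print tickets can be pictured with the following Python sketch. The content of a print ticket is not defined in this section, so modeling a ticket as a dictionary of setting names and values, and the function name itself, are purely illustrative assumptions.

```python
def effective_print_settings(tickets_from_package_to_part):
    """Combine print tickets along the path from the package to one part.

    tickets_from_package_to_part: list ordered from the package-level
    ticket down to the ticket attached to the part being printed; an
    entry is None where no ticket is attached at that level.
    """
    settings = {}
    for ticket in tickets_from_package_to_part:
        if ticket:
            # Settings attached at a lower level override higher levels.
            settings.update(ticket)
    return settings
```

In this model, a page with its own ticket inherits every package-level setting that its ticket does not explicitly override.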

The Reach Package Structure

As described above, a reach package supports a set of features including “fixed” pages, FixedPanels, composition, print tickets, and the like. These features are represented in a package using the core components of the package model: parts and relationships. In this section and its related sub-sections, a complete definition of a “reach package” is provided, including descriptions of how all these parts and relationships must be assembled, related, etc.

Reach Package Structure Overview

FIG. 10 illustrates an exemplary reach package and, in this embodiment, each of the valid types of parts that can make up or be found in a package. The table provided just below lists each valid part type and provides a description of each:

FixedPage
    Description: Each FixedPage part represents the content of a page
    Content type: application/xml+FixedPage-PLACEHOLDER

FixedPanel
    Description: Each FixedPanel glues together a set of FixedPages in order
    Content type: application/xml+FixedPanel-PLACEHOLDER

Font
    Description: Fonts can be embedded in a package to ensure reliable reproduction of the document's glyphs.

Image
    Description: Image parts can be included
    Content type: image/jpeg, image/png

Composition Parts
    Description: Selectors and sequences can be used to build a “composition” block, introducing higher-level organization to the package.
    Content type: application/xml+Selector+[XXX], Application/xml+Sequence+[XXX]

Descriptive Metadata
    Description: Descriptive metadata (e.g., title, keywords) can be included for the document.
    Content type: application/xml+SimpleTypeProperties-PLACEHOLDER

Print Ticket
    Description: A print ticket can be included to provide settings to be used when printing the package.
    Content type: application/xml+PRINTTICKET-PLACEHOLDER

Because a reach package is designed to be a “view and print anywhere” document, readers and writers of reach packages must share common, unambiguously-defined expectations of what constitutes a “valid” reach package. To provide a definition of a “valid” reach package, a few concepts are first defined below.

Reach Composition Parts

A reach package must contain at least one FixedPanel that is “discoverable” by traversing the composition block from the starting part of the package. In accordance with the described embodiment, the discovery process follows the following algorithm:

-   Recursively traverse the graph of composition parts starting at the package starting part.
-   When performing this traversal, only traverse into composition parts that are reach composition parts (described below).
-   Locate all of the terminal nodes (those without outgoing arcs) at the edge of the graph.

These terminal nodes refer (via their <item> elements) to a set of parts called the reach payload roots.
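The discovery algorithm above can be sketched in Python as follows. The in-memory model of a package (a dictionary mapping each part name to its content type and the part names listed in its <item> elements) and the function names are assumptions made for illustration. The content types are those of the reach composition parts listed later in this section. The sketch treats a composition part as a terminal node when none of its items is itself a reach composition part.

```python
REACH_COMPOSITION_TYPES = {
    "application/xml+selector+language",
    "application/xml+selector+color",
    "application/xml+selector+pagesize",
    "application/xml+selector+contenttype",
    "application/xml+sequence+fixed",
}

def _is_reach_composition(parts, name):
    return parts.get(name, {}).get("content_type") in REACH_COMPOSITION_TYPES

def find_reach_payload_roots(parts, starting_part):
    """Return the parts referenced by the terminal reach composition parts."""
    roots = []
    visited = set()

    def visit(name):
        if name in visited:
            return
        visited.add(name)
        items = parts[name].get("items", [])
        children = [i for i in items if _is_reach_composition(parts, i)]
        if children:
            # Only traverse into reach composition parts.
            for child in children:
                visit(child)
        else:
            # Terminal node: its items are reach payload roots.
            for item in items:
                if item not in roots:
                    roots.append(item)

    if _is_reach_composition(parts, starting_part):
        visit(starting_part)
    return roots
```

A package would then satisfy the discoverability requirement when at least one of the returned roots is a FixedPanel part.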

Fixed Payload

A fixed payload is a payload whose root part is a FixedPanel part. For example, each of the fixed payloads in FIG. 10 has as its root part an associated FixedPanel part. The payload includes the full closure of all of the parts required for valid processing of the FixedPanel. These include:

-   The FixedPanel itself;
-   All FixedPages referenced from within the FixedPanel;
-   All image parts referenced (directly, or indirectly through a selector) by any of the FixedPages in the payload;
-   All reach selectors (as described below) referenced directly or indirectly from image brushes used within any of the FixedPages within the payload;
-   All font parts referenced by any of the FixedPages in the payload;
-   All descriptive metadata parts attached to any part in the fixed payload; and
-   Any print tickets attached to any part in the fixed payload.

Validity Rules for Reach Package

With the above definitions in place, conformance rules that describe a “valid” reach package in accordance with the described embodiment are now described:

-   A reach package must have a starting part defined using the standard mechanism of a package relationship as described above;
-   The starting part of a reach package must be either a selector or a sequence;
-   A reach package must have at least one reach payload root that is a FixedPanel;
-   PrintTicket parts may be attached to any of the composition parts, FixedPanel parts or any of the FixedPage parts identified in the FixedPanel(s). In the present example, this is done via the http://PLACEHOLDER/HasPrintTicketRel relationship;
    -   PrintTickets may be attached to any or all of these parts;
    -   Any given part must have no more than one PrintTicket attached;
-   A Descriptive Metadata part may be attached to any part in the package;
-   Every Font object in the FixedPayload must meet the font format rules defined in section “Font Parts”;
-   References to images from within any FixedPage in the fixed payload may point to a selector which may make a selection (potentially recursively through other selectors) to find the actual image part to be rendered;
-   Every Image object used in the fixed payload must meet the image format rules defined in section “Image Parts”;
-   For any font, image or selector part referenced from a FixedPage (directly, or indirectly through selector), there must be a “required part” relationship (relationship name=http://mmcf-fixed-RequiredResource-PLACEHOLDER) from the referencing FixedPage to the referenced part.

Reach Composition Parts

While a reach package may contain many types of composition part, only a well-defined set of types of composition parts have well-defined behavior according to this document. These composition parts with well-defined behavior are called reach composition parts. Parts other than these are not relevant when determining validity of a reach package.

The following types of composition parts are defined as reach composition parts:

Language Selector
    Description: Chooses between representations based on their natural language
    Content type: application/xml+selector+language

Color Selector
    Description: Chooses between representations based on whether they are monochromatic or color
    Content type: application/xml+selector+color

Page Size Selector
    Description: Chooses between representations based on their page size
    Content type: application/xml+selector+pagesize

Content Type Selector
    Description: Chooses between representations based on whether their content types can be understood by the system
    Content type: application/xml+selector+contenttype

Fixed Sequence
    Description: Combines children that are fixed content into a sequence
    Content type: application/xml+sequence+fixed

Reach Selectors

Those selector composition parts defined as reach composition parts are called reach selectors. As noted above, a language selector picks between representations based on their natural language, such as English or French. To discover this language, the selector inspects each of its items. Only those that are XML are considered. For those, the root element of each one is inspected to determine its language. If the xml:lang attribute is not present, the part is ignored. The selector then considers each of these parts in turn, selecting the first one whose language matches the system's default language.
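The language selector behavior described above might be sketched in Python as follows, using the standard xml.etree.ElementTree module. The function name and the representation of items as (part name, content) pairs are assumptions of the sketch. The case-insensitive exact comparison of language tags is also an assumption, since this section does not define what constitutes a match.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def select_by_language(items, system_language):
    """Return the name of the first item whose root xml:lang matches.

    items: iterable of (part_name, content) pairs in selector order.
    Returns None when no item matches.
    """
    for name, content in items:
        try:
            root = ET.fromstring(content)
        except ET.ParseError:
            continue  # only items that are XML are considered
        lang = root.get(XML_LANG)
        if lang is None:
            continue  # no xml:lang attribute: the part is ignored
        if lang.lower() == system_language.lower():
            return name
    return None
```

Items are examined in order, so when two items carry the same language the earlier one is selected.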

A color selector chooses between representations based on whether they are monochromatic or color. The page size selector chooses between representations based on their page size. A content type selector chooses between representations based on whether their content types can be understood by the system.

Reach Sequences

Those sequence composition parts defined as reach composition parts are called reach sequences. A fixed sequence combines children that are fixed content into a sequence.

Fixed Payload Parts

The fixed payload can contain the following kinds of parts: a FixedPanel part, a FixedPage part, Image parts, Font parts, Print Ticket parts, and Descriptive Metadata parts, each of which is discussed below under its own sub-heading.

The FixedPanel Part

The document structure of the Fixed-Payload identifies FixedPages as part of a spine, as shown below. The relationships between the spine part and the page parts are defined within the relationships stream for the spine. The FixedPanel part is of content type application/xml+PLACEHOLDER.

The spine of the Fixed-Payload content is specified in markup by including a <FixedPanel> element within a <Document> element. In the example below, the <FixedPanel> element specifies the sources of the pages that are held in the spine.

    <!-- SPINE -->
    <Document $XMLNSFIXED$ >
      <FixedPanel>
        <PageContent Source="p1.xml" />
        <PageContent Source="p2.xml" />
      </FixedPanel>
    </Document>

The <Document> Element

The <Document> element has no attributes and must have only one child: <FixedPanel>.

The <FixedPanel> Element

The <FixedPanel> element is the document spine, logically binding an ordered sequence of pages together into a single multi-page document. Pages always specify their own width and height, but a <FixedPanel> element may also optionally specify a height and width. This information can be used for a variety of purposes including, for example, selecting between alternate representations based on page size. If a <FixedPanel> element specifies a height and width, it will usually be aligned with the width and height of the pages within the <FixedPanel>, but these dimensions do not specify the height and width of individual pages.

The following table summarizes FixedPanel attributes in accordance with the described embodiment.

<FixedPanel> Attribute    Description
PageHeight                Typical height of pages contained in the <FixedPanel>. Optional
PageWidth                 Typical width of pages contained in the <FixedPanel>. Optional

The <PageContent> element is the only allowable child element of the <FixedPanel> element. The <PageContent> elements are in sequential markup order matching the page order of the document.

The <PageContent> Element

Each <PageContent> element refers to the source of the content for a single page. To determine the number of pages in the document, one would count the number of <PageContent> children contained within the <FixedPanel>.
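By way of example and not limitation, the page count described above might be obtained as in the following Python sketch. The sketch assumes the spine markup has already been read from the package as a string, and it omits the $XMLNSFIXED$ namespace placeholder so that element names can be matched directly. The function name page_sources is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample spine markup from the text, with the namespace placeholder omitted.
SPINE = """
<Document>
  <FixedPanel>
    <PageContent Source="p1.xml" />
    <PageContent Source="p2.xml" />
  </FixedPanel>
</Document>
"""

def page_sources(spine_markup):
    """Return the Source of each <PageContent>, in markup (page) order."""
    document = ET.fromstring(spine_markup)
    panel = document.find("FixedPanel")
    if panel is None:
        raise ValueError("<Document> must contain a <FixedPanel> child")
    return [page.get("Source") for page in panel.findall("PageContent")]

sources = page_sources(SPINE)
print(len(sources), sources)
```

Because the <PageContent> elements appear in page order, the list index of a source doubles as its zero-based page index.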

The <PageContent> element has no allowable children, and has a single required attribute, Source, which refers to the FixedPage part for the contents of a page.

As with the <FixedPanel> element, the <PageContent> element may optionally include a PageHeight and PageWidth attribute, here reflecting the size of the single page. The required page size is specified in the FixedPage part; the optional size on <PageContent> is advisory only. The <PageContent> size attributes allow applications such as document viewers to make visual layout estimates for a document quickly, without loading and parsing all of the individual FixedPage parts.
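As an illustration only, a viewer might build a quick layout estimate from the advisory sizes as sketched below. Falling back to the typical PageHeight of the <FixedPanel> when a <PageContent> carries no size is an assumption of this sketch, not a rule stated in this document, and the function name estimated_heights is hypothetical.

```python
import xml.etree.ElementTree as ET

SPINE = """
<Document>
  <FixedPanel PageHeight="1056" PageWidth="816">
    <PageContent Source="p1.xml" PageHeight="1056" PageWidth="816" />
    <PageContent Source="p2.xml" />
  </FixedPanel>
</Document>
"""

def estimated_heights(spine_markup):
    """Estimate each page height without loading any FixedPage part."""
    panel = ET.fromstring(spine_markup).find("FixedPanel")
    typical = panel.get("PageHeight")  # may be absent (None)
    heights = []
    for page in panel.findall("PageContent"):
        # Prefer the advisory size on <PageContent>; otherwise use the
        # typical size from <FixedPanel> (an assumption of this sketch).
        advisory = page.get("PageHeight", typical)
        heights.append(float(advisory) if advisory is not None else None)
    return heights

print(estimated_heights(SPINE))
```

The estimate remains advisory; the authoritative size is the one read later from each FixedPage part.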

The table provided just below summarizes <PageContent> attributes and provides a description of the attributes.

<PageContent> Attribute   Description
Source                    A URI string that refers to the page content, held in a distinct part within the package. The content is identified as a part within the package. Required.
PageHeight                Optional
PageWidth                 Optional

The URI string of the page content must reference the part location of the content relative to the package.

The FixedPage Part

Each <PageContent> element in the <FixedPanel> references by name (URI) a FixedPage part. Each FixedPage part contains FixedPage markup describing the rendering of a single page of content. The FixedPage part is of Content Type application/xml+PLACEHOLDER-FixedPage.

Describing FixedPages in Markup

Below is an example of how the markup of the source content might look for the page referenced in the sample spine markup above (<PageContent Source="p1.xml"/>).

// /content/p1.xml
<FixedPage PageHeight="1056" PageWidth="816">
  <Glyphs
    OriginX = "96"
    OriginY = "96"
    UnicodeString = "This is Page 1!"
    FontUri = "../Fonts/Times.TTF"
    FontRenderingEmSize = "16"
  />
</FixedPage>

The table below summarizes FixedPage properties and provides a description of the properties.

FixedPage Property   Description
PageHeight           Required
PageWidth            Required

Reading Order in FixedPage Markup

In one embodiment, the markup order of the Glyphs child elements contained within a FixedPage must be the same as the desired reading order of the text content of the page. This reading order may be used both for interactive selection/copy of sequential text from a FixedPage in a viewer, and for enabling access to sequential text by accessibility technology. It is the responsibility of the application generating the FixedPage markup to ensure this correspondence between markup order and reading order.

Image Parts

Supported Formats

In accordance with the described embodiment, image parts used by FixedPages in a reach package can be in a fixed number of formats, e.g., PNG or JPEG, although other formats can be used.

Font Parts

In accordance with the described embodiment, reach packages support a limited number of font formats. In the illustrated and described embodiment, the supported font formats include the TrueType format and the OpenType format.

As will be appreciated by the skilled artisan, the OpenType font format is an extension of the TrueType font format, adding support for PostScript font data and complex typographical layout. An OpenType font file contains data, in table format, that comprises either a TrueType outline font or a PostScript outline font.

In accordance with the described embodiment, the following font formats are not supported in reach packages: Adobe Type 1 fonts, bitmap fonts, fonts with the hidden attribute (use the system flag to decide whether to enumerate them or not), vector fonts, and EUDC fonts (whose font family name is EUDC).

Subsetting Fonts

Fixed payloads represent all text using the Glyphs element described in detail below. Since, in this embodiment, the format is fixed, it is possible to subset fonts to contain only the glyphs required by FixedPayloads. Therefore, fonts in reach packages may be subsetted based on glyph usage. Though a subsetted font will not contain all the glyphs in the original font, the subsetted font must be a valid OpenType font file.

Print Ticket Parts

Print ticket parts provide settings that can be used when the package is printed. These print tickets can be attached in a variety of ways to achieve substantial flexibility. For example, a print ticket can be “attached” to an entire package and its settings will affect the whole package. Print tickets can be further attached at lower levels in the structure (e.g., to individual pages) and these print tickets will provide override settings to be used when printing the part to which they are attached.
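The override behavior described above can be pictured with the following Python sketch. The representation of a print ticket as a dictionary and the setting names "copies" and "duplex" are hypothetical; this document does not define the contents of a print ticket.

```python
def effective_settings(*tickets):
    """Merge print-ticket settings from the outermost level to the innermost.

    Later (lower-level) tickets override earlier (higher-level) ones.
    A level with no attached ticket is passed as None and is skipped.
    """
    merged = {}
    for ticket in tickets:
        if ticket:
            merged.update(ticket)
    return merged

# Hypothetical setting names, for illustration only.
package_ticket = {"copies": 1, "duplex": True}
page_ticket = {"duplex": False}

print(effective_settings(package_ticket, None, page_ticket))
```

Here the page-level ticket overrides only the setting it names; every other setting is inherited from the package-level ticket.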

Descriptive Metadata

As noted above, descriptive metadata parts provide writers or producers of packages with a way in which to store values of properties that enable readers of the packages to reliably discover the values. These properties are typically used to record additional information about the package as a whole, as well as individual parts within the container.

FixedPage Markup Basics

This section describes some basic information associated with the FixedPage markup and includes the following sections: “Fixed Payload and Other Markup Standards”, “FixedPage Markup Model”, “Resources and Resource References”, and “FixedPage Drawing Model”.

Fixed Payload and Other Markup Standards

The FixedPanel and FixedPage markup for the Fixed Payload in a reach package is a subset of Windows® Longhorn's Avalon XAML markup. That is, while the Fixed Payload markup stands alone as an independent XML markup format (as documented in this document), it loads in the same way as in Longhorn systems, and renders a WYSIWYG reproduction of the original multi-page document.

As some background on XAML markup, consider the following. XAML markup is a mechanism that allows a user to specify a hierarchy of objects and the programming logic behind the objects as an XML-based markup language. This provides the ability for an object model to be described in XML. This allows extensible classes, such as classes in the Common Language Runtime (CLR) of the .NET Framework by Microsoft Corporation, to be accessed in XML. The XAML mechanism provides a direct mapping of XML tags to CLR objects and the ability to represent related code in the markup. It is to be appreciated and understood that various implementations need not specifically utilize a CLR-based implementation of XAML. Rather, a CLR-based implementation constitutes but one way in which XAML can be employed in the context of the embodiments described in this document.

More specifically, consider the following in connection with FIG. 11 which illustrates an exemplary mapping of CLR concepts (left side components) to XML (right side components). Namespaces are found in the xmlns declaration using a CLR concept called reflection. Classes map directly to XML tags. Properties and events map directly to attributes. Using this hierarchy, a user can specify a hierarchy tree of any CLR objects in XML markup files. XAML files are XML files with a .xaml extension and a media type of application/xaml+xml. XAML files have one root tag that typically specifies a namespace using the xmlns attribute. The namespace may be specified in other types of tags.

Continuing, tags in a XAML file generally map to CLR objects. Tags can be elements, compound properties, definitions or resources. Elements are CLR objects that are generally instantiated during runtime and form a hierarchy of objects. Compound property tags are used to set a property in a parent tag. Definition tags are used to add code into a page and define resources. The resource tag provides the ability to reuse a tree of objects merely by specifying the tree as a resource. Definition tags may also be defined within another tag as an xmlns attribute.

Once a document is suitably described in markup (typically by a writer), the markup can be parsed and processed (typically by a reader). A suitably configured parser determines from the root tag which CLR assemblies and namespaces should be searched to find a tag. In many instances, the parser looks for and will find a namespace definition file in a URL specified by the xmlns attribute. The namespace definition file provides the name of assemblies and their install path and a list of CLR namespaces. When the parser encounters a tag, the parser determines which CLR class the tag refers to using the xmlns of the tag and the xmlns definition file for that xmlns. The parser searches in the order that the assemblies and namespaces are specified in the definition file. When it finds a match, the parser instantiates an object of the class.
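The ordered lookup described above is sketched below in Python rather than in a CLR language, purely for illustration. The NAMESPACE_DEFINITIONS table is a hypothetical in-memory stand-in for a namespace definition file, the URI "urn:example-fixed" is invented, and the Canvas and Path classes are empty placeholders.

```python
class Canvas:
    """Placeholder class standing in for a runtime object type."""

class Path:
    """Placeholder class standing in for a runtime object type."""

# Hypothetical stand-in for a namespace definition file: for each xmlns
# URI, an ordered list of namespaces, each mapping tag names to classes.
NAMESPACE_DEFINITIONS = {
    "urn:example-fixed": [
        {"Canvas": Canvas},
        {"Path": Path, "Canvas": object},  # shadowed by the earlier entry
    ],
}

def instantiate(xmlns, tag):
    """Search namespaces in definition order; instantiate the first match."""
    for namespace in NAMESPACE_DEFINITIONS.get(xmlns, []):
        cls = namespace.get(tag)
        if cls is not None:
            return cls()
    raise LookupError("no class found for tag %r in %r" % (tag, xmlns))

print(type(instantiate("urn:example-fixed", "Canvas")).__name__)
print(type(instantiate("urn:example-fixed", "Path")).__name__)
```

Because the search stops at the first match, the order of entries in the definition determines which class a tag resolves to when more than one namespace defines the same name.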

Thus, the mechanism described just above, and more fully in the application incorporated by reference above, allows object models to be represented in an XML-based file using markup tags. This ability to represent object models as markup tags can be used to create vector graphic drawings, fixed-format documents, adaptive-flow documents, and application UIs asynchronously or synchronously.

In the illustrated and described embodiment, the Fixed Payload markup is a very minimal, nearly completely parsimonious subset of Avalon XAML rendering primitives. It represents visually anything that can be represented in Avalon, with full fidelity. The Fixed Payload markup is a subset of Avalon XAML elements and properties, plus additional conventions, canonical forms, or restrictions in usage compared to Avalon XAML.

The radically-minimal Fixed Payload markup set defined reduces the cost associated with implementation and testing of reach package readers, such as printer RIPs or interactive viewer applications, as well as reducing the complexity and memory footprint of the associated parser. The parsimonious markup set also minimizes the opportunities for subsetting, errors, or inconsistencies among reach package writers and readers, making the format and its ecosystem inherently more robust.

In addition to the minimal Fixed Payload markup, the reach package will specify markup for additional semantic information to support viewers or presentations of reach package documents with features such as hyperlinks, section/outline structure and navigation, text selection, and document accessibility.

Finally, using the versioning and extensibility mechanisms described above, it is possible to supplement the minimal Fixed Payload markup with a richer set of elements for specific target consuming applications, viewers, or devices.

FixedPage Markup Model

In the illustrated and described embodiment, a FixedPage part is expressed in an XML-based markup language, based on XML-Elements, XML-Attributes, and XML-Namespaces. Three XML-Namespaces are defined in this document for inclusion in FixedPage markup. One such namespace references the Version-control elements and attributes defined elsewhere in this specification. The principal namespace used for elements and attributes in the FixedPage markup is “http://schemas.microsoft.com/MMCF-PLACEHOLDER-FixedPage”. And finally, FixedPage markup introduces a concept of “Resources” which requires a third namespace, described below.

Although FixedPage markup is expressed using XML-Elements and XML-Attributes, its specification is based upon a higher-level abstract model of “Contents” and “Properties”. The FixedPage elements are all expressed as XML-elements. Only a handful of FixedPage elements can hold “Contents”, expressed as child XML-elements. But a property-value may be expressed using an XML-Attribute or using a child XML-element.

FixedPage Markup also depends upon the twin concepts of a Resource-Dictionary and Resource-Reference. The combination of a Resource-Dictionary and multiple Resource-References allows for a single property-value to be shared by multiple properties of multiple FixedPage-markup elements.

Properties in FixedPage Markup

In the illustrated and described embodiment, there are three forms of markup which can be used to specify the value of a FixedPage-markup property.

If the property is specified using a resource-reference, then the property name is used as an XML-attribute name, and a special syntax for the attribute-value indicates the presence of a resource reference. The syntax for expressing resource-references is described in the section entitled “Resources and Resource-References”.

Any property-value that is not specified as a resource-reference may be expressed in XML using a nested child XML-element identifying the property whose value is being set. This “Compound-Property Syntax” is described below.

Finally, some non-resource-reference property-values can be expressed as simple-text strings. Although all such property-values may be expressed using Compound-Property Syntax, they may also be expressed using simple XML-attribute syntax.

For any given element, any property may be set no more than once, regardless of the syntax used for specifying a value.

Simple Attribute Syntax

For a property value expressible as a simple string, XML-attribute-syntax may be used to specify a property-value. For example, given the FixedPage-markup element called “SolidColorBrush,” with the property called “Color”, the following syntax can be used to specify a property value:

<!-- Simple Attribute Syntax -->
<SolidColorBrush Color="#FF0000" />

Compound-Property Syntax

Some property values cannot be expressed as a simple string, e.g., when an XML-element is used to describe the property value. Such a property value cannot be expressed using simple attribute syntax, but it can be expressed using compound-property syntax.

In compound-property syntax, a child XML-Element is used, but the XML-Element name is derived from a combination of the parent-element name and the property name, separated by a dot. Given the FixedPage-markup element <Path>, which has a property “Fill” which may be set to a <SolidColorBrush>, the following markup can be used to set the “Fill” property of the <Path> element:

<!-- Compound-Property Syntax -->
<Path>
  <Path.Fill>
    <SolidColorBrush Color="#FF0000" />
  </Path.Fill>
  ...
</Path>

Compound-Property Syntax may be used even in cases where Simple-Attribute Syntax would suffice to express a property-value. So, the example of the previous section:

<!-- Simple Attribute Syntax -->
<SolidColorBrush Color="#FF0000" />

can be expressed instead in Compound-Property Syntax:

<!-- Compound-Property Syntax -->
<SolidColorBrush>
  <SolidColorBrush.Color>#FF0000</SolidColorBrush.Color>
</SolidColorBrush>
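Because a property can be written in either syntax, a reader may wish to check the rule that any property is set no more than once on a given element. The following Python sketch shows one way such a check might look; the function name duplicate_properties is hypothetical, and XML namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

def duplicate_properties(element):
    """Return the names of properties set more than once on `element`.

    A property counts once per XML attribute and once per child element
    named "<Parent>.<Property>" (compound-property syntax).
    """
    counts = {}
    for name in element.attrib:
        counts[name] = counts.get(name, 0) + 1
    prefix = element.tag + "."
    for child in element:
        if child.tag.startswith(prefix):
            name = child.tag[len(prefix):]
            counts[name] = counts.get(name, 0) + 1
    return sorted(name for name, count in counts.items() if count > 1)

# Color is set twice here: once as an attribute, once as a child element.
both = ET.fromstring(
    '<SolidColorBrush Color="#FF0000">'
    '<SolidColorBrush.Color>#FF0000</SolidColorBrush.Color>'
    '</SolidColorBrush>'
)
print(duplicate_properties(both))
```

An empty result indicates that the element satisfies the rule.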

When specifying a property-value using Compound-Property Syntax, the child XML-elements representing “Properties” must appear before child XML-elements representing “Contents”. The order of individual Compound-Property child XML-elements is not important, only that they appear together before any “Contents” of the parent-element.

For example, when using both Clip and RenderTransform properties of the <Canvas> element (described below), both must appear before any <Path> and <Glyphs> Contents of the <Canvas>:

<Canvas>
  <!-- First, the property-related child elements -->
  <Canvas.RenderTransform>
    <MatrixTransform Matrix="1,0,0,1,0,0" />
  </Canvas.RenderTransform>
  <Canvas.Clip>
    <PathGeometry>
      ...
    </PathGeometry>
  </Canvas.Clip>
  <!-- Then, the "Contents" -->
  <Path ...>
    ...
  </Path>
  <Glyphs ...>
    ...
  </Glyphs>
</Canvas>
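The ordering rule illustrated above can be checked mechanically. The Python sketch below is one possible approach under simplifying assumptions: XML namespaces are omitted, and only the rule that property children precede “Contents” is tested. The function name properties_precede_contents is hypothetical.

```python
import xml.etree.ElementTree as ET

def properties_precede_contents(element):
    """True if every "<Parent>.<Property>" child precedes all Contents."""
    prefix = element.tag + "."
    seen_content = False
    for child in element:
        is_property = child.tag.startswith(prefix)
        if is_property and seen_content:
            return False  # a property element appeared after Contents
        if not is_property:
            seen_content = True
    return True

good = ET.fromstring(
    '<Canvas><Canvas.RenderTransform>'
    '<MatrixTransform Matrix="1,0,0,1,0,0" />'
    '</Canvas.RenderTransform><Path /></Canvas>'
)
bad = ET.fromstring(
    '<Canvas><Path /><Canvas.Clip><PathGeometry /></Canvas.Clip></Canvas>'
)
print(properties_precede_contents(good), properties_precede_contents(bad))
```

The relative order of the property children themselves is deliberately not examined, since that order is stated to be unimportant.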

Resources and Resource References

Resource Dictionaries can be used to hold shareable property values, each called a resource. Any property value which is itself a FixedPage-markup element may be held in a Resource Dictionary. Each resource in a Resource Dictionary carries a name. The resource's name can be used to reference the resource from a property's XML-attribute.

In the illustrated and described embodiment, the <Canvas> and <FixedPage> elements can carry a Resource Dictionary. A Resource Dictionary is expressed in markup as a property of the <Canvas> and <FixedPage> elements in a property called “Resources”. However, individual resource-values are embedded directly within the <FixedPage.Resources> or <Canvas.Resources> XML-element. Syntactically, the markup for <Canvas.Resources> and <FixedPage.Resources> resembles that for markup elements with “Contents”.

In accordance with this embodiment, <Canvas.Resources> or <FixedPage.Resources> must precede any compound-property-syntax property values of the <Canvas> or <FixedPage>. They similarly must precede any “Contents” of the <Canvas> or <FixedPage>.

Defining Fixed-Payload Resource Dictionaries

Any <FixedPage> or <Canvas> can carry a Resource Dictionary, expressed using the <FixedPage.Resources> or <Canvas.Resources> XML-element. Each element within a single resource dictionary is given a unique name, identified by using an XML-attribute associated with the element. To distinguish this “Name” attribute from those attributes corresponding to properties, the Name attribute is taken from a namespace other than that of the FixedFormat elements. The URI for that XML-namespace is “http://schemas.microsoft.com/PLACEHOLDER-for-resources”. In the example below, two geometries are defined: one for a rectangle and the other for a circle.

<Canvas xmlns:def="http://schemas.microsoft.com/PLACEHOLDER-for-resources">
  <Canvas.Resources>
    <PathGeometry def:Name="Rectangle">
      <PathFigure>
        ...
      </PathFigure>
    </PathGeometry>
    <PathGeometry def:Name="Circle">
      <PathFigure>
        ...
      </PathFigure>
    </PathGeometry>
  </Canvas.Resources>
</Canvas>

Referencing Resources

To set a property value to one of the resources defined above, use an XML-attribute value which encloses the resource name in { }. For example, “{Rectangle}” will denote the geometry to be used. In the markup sample below, the rectangular region defined by the geometry objects in the dictionary will be filled by the SolidColorBrush.

<Canvas>
  <Canvas.Resources>
    <PathGeometry def:Name="Rectangle">
      ...
    </PathGeometry>
  </Canvas.Resources>
  <Path>
    <Path.Data>
      <PathGeometry PathGeometry="{Rectangle}" />
    </Path.Data>
    <Path.Fill>
      <SolidColorBrush Color="#FF0000" />
    </Path.Fill>
  </Path>
</Canvas>

In accordance with this embodiment, a resource reference must not occur within the definition of a resource in a Resource Dictionary.

Scoping Rules for Resolving Resource References

Although a single Name may not be used twice in the same Resource Dictionary, the same name may be used in two different Resource Dictionaries within a single FixedPage part. Furthermore, the Resource Dictionary of an inner <Canvas> may re-use a Name defined in the Resource Dictionary of some outer <Canvas> or <FixedPage>.

When a resource-reference is used to set a property of an element, various Resource Dictionaries are searched for a resource of the given name. If the element bearing the property is a <Canvas>, then the Resource Dictionary (if present) of that <Canvas> is searched for a resource of the desired name. If the element is not a <Canvas>, then the search begins with the nearest containing <Canvas> or <FixedPage>. If the desired name is not defined in the initially searched Resource Dictionary, then the next-nearest containing <Canvas> or <FixedPage> is consulted. An error occurs if the search continues to the root <FixedPage> element, and a resource of the desired name is not found in a Resource Dictionary associated with that <FixedPage>.

The example below demonstrates these rules.

<FixedPage xmlns:def="http://schemas.microsoft.com/PLACEHOLDER-for-resources"
  PageHeight="1056" PageWidth="816">
  <FixedPage.Resources>
    <Fill def:Name="FavoriteColorFill">
      <SolidColorBrush Color="#808080" />
    </Fill>
  </FixedPage.Resources>
  <Canvas>
    <Canvas.Resources>
      <Fill def:Name="FavoriteColorFill">
        <SolidColorBrush Color="#000000" />
      </Fill>
    </Canvas.Resources>
    <!-- The following Path will be filled with color #000000 -->
    <Path Fill="{FavoriteColorFill}">
      <Path.Data>
        ...
      </Path.Data>
    </Path>
    <Canvas>
      <!-- The following Path will be filled with
      color #000000 -->
      <Path Fill="{FavoriteColorFill}">
        <Path.Data>
          ...
        </Path.Data>
      </Path>
    </Canvas>
  </Canvas>
  <!-- The following path will be filled with color #808080 -->
  <Path Fill="{FavoriteColorFill}">
    <Path.Data>
      ...
    </Path.Data>
  </Path>
</FixedPage>
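The scoping rules above amount to a walk from the referencing element outward through its containing <Canvas> and <FixedPage> elements. The following Python sketch shows one way a reader might implement that walk. It is illustrative only: the sample page is a trimmed copy of the example above with the elided <Path.Data> contents removed, the element namespace is omitted, and the function name resolve is hypothetical.

```python
import xml.etree.ElementTree as ET

DEF_NS = "http://schemas.microsoft.com/PLACEHOLDER-for-resources"
NAME_ATTR = "{%s}Name" % DEF_NS  # ElementTree's spelling of def:Name

PAGE = """
<FixedPage xmlns:def="http://schemas.microsoft.com/PLACEHOLDER-for-resources"
           PageHeight="1056" PageWidth="816">
  <FixedPage.Resources>
    <Fill def:Name="FavoriteColorFill"><SolidColorBrush Color="#808080" /></Fill>
  </FixedPage.Resources>
  <Canvas>
    <Canvas.Resources>
      <Fill def:Name="FavoriteColorFill"><SolidColorBrush Color="#000000" /></Fill>
    </Canvas.Resources>
    <Path Fill="{FavoriteColorFill}" />
    <Canvas>
      <Path Fill="{FavoriteColorFill}" />
    </Canvas>
  </Canvas>
  <Path Fill="{FavoriteColorFill}" />
</FixedPage>
"""

def resolve(root, element, name):
    """Find resource `name`, searching outward from `element` to the root."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    scope = element
    while scope is not None:
        if scope.tag in ("Canvas", "FixedPage"):
            dictionary = scope.find(scope.tag + ".Resources")
            if dictionary is not None:
                for resource in dictionary:
                    if resource.get(NAME_ATTR) == name:
                        return resource
        scope = parents.get(scope)  # None once the root has been searched
    raise LookupError("resource %r not found" % name)

root = ET.fromstring(PAGE)
colors = []
for path in root.iter("Path"):  # visited in document order
    name = path.get("Fill").strip("{}")
    colors.append(resolve(root, path, name)[0].get("Color"))
print(colors)
```

The inner <Canvas> carries no dictionary of its own, so its <Path> resolves against the enclosing <Canvas>, while the <Path> outside both canvases resolves against the <FixedPage> dictionary.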

FixedPage Drawing Model

The FixedPage (or a nested Canvas child) element is the element on which other elements are rendered. The arrangement of content is controlled by properties specified for the FixedPage (or Canvas), the properties specified for elements on the FixedPage (or Canvas), and by compositional rules defined for the Fixed-Payload namespace.

Using Canvas to Position Elements

In fixed markup, all elements are positioned relative to the current origin (0,0) of the coordinate system. The current origin can be moved by applying the RenderTransform attribute to each element of the FixedPage or Canvas that contains an element.

The following example illustrates positioning of elements through RenderTransform.

<Canvas>
  <Canvas.Resources>
    <PathGeometry def:Name="StarFish">
      <!-- Various PathFigures in here -->
      ...
    </PathGeometry>
    <PathGeometry def:Name="LogoShape">
      <!-- Various PathFigures in here -->
      ...
    </PathGeometry>
  </Canvas.Resources>
  <!-- Draw a green StarFish and a red LogoShape shifted by 100 to the right and 50 down -->
  <Canvas>
    <Canvas.RenderTransform>
      <MatrixTransform Matrix="1,0,0,1,100,50"/>
    </Canvas.RenderTransform>
    <Path Fill="#00FF00" Data="{StarFish}"/>
    <Path Fill="#FF0000" Data="{LogoShape}"/>
  </Canvas>
  <!-- Draw a green StarFish and a red LogoShape shifted by 200 to the right and 250 down -->
  <Canvas>
    <Canvas.RenderTransform>
      <MatrixTransform Matrix="1,0,0,1,200,250"/>
    </Canvas.RenderTransform>
    <Path Fill="#00FF00" Data="{StarFish}"/>
    <Path Fill="#FF0000" Data="{LogoShape}"/>
  </Canvas>
</Canvas>

Coordinate Systems and Units

In accordance with the illustrated and described embodiment, the coordinate system is initially set up so that one unit in that coordinate system is equal to 1/96th of an inch, expressed as a floating point value. The origin (0,0) of the coordinate system is the top left corner of the FixedPage element.
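As a worked illustration of the 1/96-inch unit, the Python sketch below converts page units to inches and to device pixels. The function names and the 600 dpi device resolution are hypothetical; the page dimensions are those of the sample FixedPage shown earlier (PageWidth 816, PageHeight 1056).

```python
UNITS_PER_INCH = 96  # one unit is 1/96th of an inch

def units_to_inches(units):
    """Convert a length in page units to inches."""
    return units / UNITS_PER_INCH

def units_to_pixels(units, dpi):
    """Convert a length in page units to device pixels at `dpi`."""
    return units * dpi / UNITS_PER_INCH

# The sample page is 816 x 1056 units, i.e. 8.5 x 11 inches.
print(units_to_inches(816), units_to_inches(1056))
# On a hypothetical 600 dpi printer, one inch (96 units) spans 600 pixels.
print(units_to_pixels(96, 600))
```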

A RenderTransform attribute can be specified on any child element to apply an affine transform to the current coordinate system.
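The effect of such a transform on a point can be sketched in Python as follows. This document does not spell out the meaning of the six numbers in a Matrix attribute; the sketch assumes they are m11, m12, m21, m22, dx, dy applied to a row vector (x, y), which is consistent with the translation examples above ("1,0,0,1,100,50" shifts content 100 to the right and 50 down). All function names are hypothetical.

```python
def parse_matrix(text):
    """Parse a Matrix attribute value such as "1,0,0,1,100,50"."""
    values = [float(v) for v in text.split(",")]
    if len(values) != 6:
        raise ValueError("expected six comma-separated numbers")
    return tuple(values)

def apply(matrix, point):
    """Apply the affine transform to (x, y) under the assumed convention."""
    m11, m12, m21, m22, dx, dy = matrix
    x, y = point
    return (x * m11 + y * m21 + dx, x * m12 + y * m22 + dy)

def compose(inner, outer):
    """Return the single matrix that applies `inner` first, then `outer`."""
    a11, a12, a21, a22, adx, ady = inner
    b11, b12, b21, b22, bdx, bdy = outer
    return (a11 * b11 + a12 * b21, a11 * b12 + a12 * b22,
            a21 * b11 + a22 * b21, a21 * b12 + a22 * b22,
            adx * b11 + ady * b21 + bdx, adx * b12 + ady * b22 + bdy)

shift = parse_matrix("1,0,0,1,100,50")
print(apply(shift, (10, 20)))
```

For nested <Canvas> elements, compose can be used to fold the transform of an inner canvas into that of its container before points are mapped.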

Page Dimensions

The page dimensions are specified by the “PageWidth” and “PageHeight” parameters on the FixedPage element.

Composition Rules

FixedPages use the painter's model with alpha channel. In accordance with the described embodiment, composition must occur according to these rules, and in the following order:

-   The FixedPage (or any nested Canvas) is thought of as an unbounded surface to which child elements are drawn in the order they appear in the markup. The alpha channel of this surface is initialized to “0.0” (all transparent). In practice the ideal unbounded surface can be thought of as a bitmap buffer large enough to hold all marks produced by rendering all the child elements.
-   The contents of the surface are transformed using the affine transform specified by the RenderTransform property of the FixedPage (or Canvas).
-   All child elements are rendered onto the surface, clipped by the Clip property (which is also transformed using the RenderTransform property) of the FixedPage (or Canvas). The FixedPage additionally clips to the rectangle specified by (0,0,PageWidth,PageHeight). If a child element has an Opacity property or OpacityMask property, it is applied to the child element before it is rendered onto the surface.
-   Finally, the contents of the FixedPage (or Canvas) are rendered onto its containing element. In the case of FixedPage, the containing element is the physical imaging surface.
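The painter's model with an alpha channel can be illustrated for a single pixel with the Python sketch below. This document does not give a blending formula; the sketch assumes the conventional "source over" operator on non-premultiplied RGBA values in the range 0.0 to 1.0, and it ignores transforms, clipping and OpacityMask. The function names over and paint are hypothetical.

```python
def over(src, dst):
    """Composite non-premultiplied RGBA pixel `src` over `dst`."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)  # fully transparent result

    def blend(s, d):
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (blend(sr, dr), blend(sg, dg), blend(sb, db), out_a)

def paint(children):
    """Draw (color, opacity) children in markup order onto one pixel."""
    surface = (0.0, 0.0, 0.0, 0.0)  # alpha initialized to 0.0
    for (r, g, b, a), opacity in children:
        # Opacity is applied to the child before it is rendered.
        surface = over((r, g, b, a * opacity), surface)
    return surface

opaque_red = ((1.0, 0.0, 0.0, 1.0), 1.0)
half_blue = ((0.0, 0.0, 1.0, 1.0), 0.5)
print(paint([opaque_red, half_blue]))
```

Later children are drawn on top of earlier ones, so reversing the list would leave the pixel fully red, because the opaque child would then cover the half-transparent one.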

Rendering occurs according to these rules:

-   The only elements that produce marks on a surface are “Glyphs” and “Path”.
-   All other rendering effects can be achieved by positioning “Glyphs” and “Path” elements onto a “Canvas”, and applying their various valid attributes.

Fixed-Payload Elements and Properties

The Fixed Payload, in accordance with the illustrated and described embodiment, includes a small set of XML elements used in markup to represent pages and their contents. The markup in a FixedPanel part brings the pages of a document together to a common, easily-indexed root, using <Document>, <FixedPanel>, and <PageContent> elements. Each FixedPage part represents a page's contents in a <FixedPage> element with only <Path> and <Glyphs> elements (which together do all of the drawing), and the <Canvas> element to group them.

The Fixed-Payload markup's element hierarchy is summarized in the following sections entitled “Top-level elements”, “Geometry for Path, Clip”, “Brushes used to fill a Path, Glyphs, or OpacityMask”, “Resource dictionaries for FixedPage or Canvas”, “Opacity masks for alpha transparency”, “Clipping paths” and “Transforms”.

Top-level elements

<Document> [exactly one per FixedPanel part]
  Attributes:
    [none]
  Child Elements:
    <FixedPanel> [exactly one]

<FixedPanel>
  Attributes:
    PageHeight [optional]
    PageWidth [optional]
  Child Elements:
    <PageContent> [1-N of these child elements]

<PageContent>
  Attributes:
    Source [required]
    PageHeight [optional]
    PageWidth [optional]
  Child Elements:
    [none]

<FixedPage>
  Properties expressed via simple XML attributes directly:
    PageHeight [required (here or as child element)]
    PageWidth [required (here or as child element)]
  Resource dictionary itself expressed as an XML child element:
    <FixedPage.Resources>
  Properties expressed via XML child elements:
    <FixedPage.PageHeight> [required (here or as attribute)]
    <FixedPage.PageWidth> [required (here or as attribute)]
  Content via XML child Elements:
    <Canvas>
    <Path>
    <Glyphs>

<Canvas>
  Properties expressed via simple XML attributes directly:
    Opacity
  Properties expressed via resource dictionary reference:
    Clip
    RenderTransform
    OpacityMask
  Resource dictionary itself expressed as an XML child element:
    <Canvas.Resources>
  Properties expressed via XML child elements:
    <Canvas.Opacity>
    <Canvas.Clip>
    <Canvas.RenderTransform>
    <Canvas.OpacityMask>
  Content via XML child Elements:
    <Canvas>
    <Path>
    <Glyphs>

<Path>
  Properties expressed via simple XML attributes directly:
    Opacity
  Properties expressed via resource dictionary reference:
    Clip
    RenderTransform
    OpacityMask
    Fill
  Properties expressed via XML child elements:
    <Path.Opacity>
    <Path.Clip>
    <Path.RenderTransform>
    <Path.OpacityMask>
    <Path.Fill>
    <Path.Data>

<Glyphs>
  Properties expressed via simple XML attributes directly:
    Opacity
    BidiLevel
    FontFaceIndex
    FontHintingEmSize
    FontRenderingEmSize
    FontUri
    Indices
    OriginX
    OriginY
    Sideways
    StyleSimulations
    UnicodeString
  Properties expressed via resource dictionary reference:
    Clip
    RenderTransform
    OpacityMask
    Fill
  Properties expressed via XML child elements:
    <Glyphs.Clip>
    <Glyphs.RenderTransform>
    <Glyphs.OpacityMask>
    <Glyphs.Fill>
    <Glyphs.Opacity>
    <Glyphs.BidiLevel>
    <Glyphs.FontFaceIndex>
    <Glyphs.FontHintingEmSize>
    <Glyphs.FontRenderingEmSize>
    <Glyphs.FontUri>
    <Glyphs.Indices>
    <Glyphs.OriginX>
    <Glyphs.OriginY>
    <Glyphs.Sideways>
    <Glyphs.StyleSimulations>
    <Glyphs.UnicodeString>

Geometry for Path, Clip

<Path.Data>
  Attributes:
    [none]
  Property value expressed as a single XML child element:
    [Path.Data has exactly one total of these children]
    <GeometryCollection>
    <PathGeometry>

<GeometryCollection>
  Attributes:
    CombineMode
  Child Elements:
    [1-N children]
    <GeometryCollection>
    <PathGeometry>

<PathGeometry>
  Attributes:
    FillRule
  Child Elements:
    [1-N children]
    <PathFigure>

<PathFigure>
  Attributes:
    [None]
  Child Elements:
    [StartSegment comes first, CloseSegment last, 1-N of Poly* in between.]
    <StartSegment>
    <PolyLineSegment>
    <PolyBezierSegment>
    <CloseSegment>

<StartSegment>
  Properties expressed via simple XML attributes directly:
    Point
  Properties expressed via XML child elements:
    <StartSegment.Point>

<PolyLineSegment>
  Properties expressed via simple XML attributes directly:
    Points
  Properties expressed via XML child elements:
    <PolyLineSegment.Points>

<PolyBezierSegment>
  Properties expressed via simple XML attributes directly:
    Points
  Properties expressed via XML child elements:
    <PolyBezierSegment.Points>

Brushes used to fill a Path, Glyphs, or OpacityMask

<Path.Fill>
  Attributes:
    [none]
  Property value expressed as a single XML child element:
    [Path.Fill has exactly one of these children]
    <SolidColorBrush>
    <ImageBrush>
    <DrawingBrush>
    <LinearGradientBrush>
    <RadialGradientBrush>

<Glyphs.Fill>
  Attributes:
    [none]
  Property value expressed as a single XML child element:
    [Glyphs.Fill has exactly one of these children]
    <SolidColorBrush>
    <ImageBrush>
    <DrawingBrush>
    <LinearGradientBrush>
    <RadialGradientBrush>

<SolidColorBrush>
  Properties expressed via simple XML attributes directly:
    Opacity
    Color
  Properties expressed via XML child elements:
    <SolidColorBrush.Opacity>
    <SolidColorBrush.Color>

<ImageBrush>
  Properties expressed via simple XML attributes directly:
    Opacity
    HorizontalAlignment
    VerticalAlignment
    ViewBox
    ViewPort
    Stretch
    TileMode
    ContentUnits
    ViewportUnits
    ImageSource
  Properties expressed via resource dictionary reference:
    Transform
  Properties expressed via XML child elements:
    <ImageBrush.Opacity>
    <ImageBrush.Transform>
    <ImageBrush.HorizontalAlignment>
    <ImageBrush.VerticalAlignment>
    <ImageBrush.ViewBox>
    <ImageBrush.ViewPort>
    <ImageBrush.Stretch>
    <ImageBrush.TileMode>
    <ImageBrush.ContentUnits>
    <ImageBrush.ViewportUnits>
    <ImageBrush.ImageSource>

<DrawingBrush>
  Properties expressed via simple XML attributes directly:
    Opacity
    HorizontalAlignment
    VerticalAlignment
    ViewBox
    ViewPort
    Stretch
    TileMode
    ContentUnits
    ViewportUnits
  Properties expressed via resource dictionary reference:
    Transform
    Drawing
  Properties expressed via XML child elements:
    <DrawingBrush.Opacity>
    <DrawingBrush.Transform>
    <DrawingBrush.HorizontalAlignment>
    <DrawingBrush.VerticalAlignment>
    <DrawingBrush.ViewBox>
    <DrawingBrush.ViewPort>
    <DrawingBrush.Stretch>
    <DrawingBrush.TileMode>
    <DrawingBrush.ContentUnits>
    <DrawingBrush.ViewportUnits>
    <DrawingBrush.Drawing>

<DrawingBrush.Drawing>
  Content via XML child Elements:
    <Canvas>
    <Path>
    <Glyphs>

<LinearGradientBrush>
  Properties expressed via simple XML attributes directly:
    Opacity
    MappingMode
    SpreadMethod
    StartPoint
    EndPoint
  Properties expressed via resource dictionary reference:
    Transform
    GradientStops
  Properties expressed via XML child elements:
    <LinearGradientBrush.Opacity>
    <LinearGradientBrush.Transform>
    <LinearGradientBrush.MappingMode>
    <LinearGradientBrush.SpreadMethod>
    <LinearGradientBrush.StartPoint>
    <LinearGradientBrush.EndPoint>
    <LinearGradientBrush.GradientStops>

<RadialGradientBrush>
  Properties expressed via simple XML attributes directly:
    Opacity
    Center
    Focus
    RadiusX
    RadiusY
  Properties expressed via resource dictionary reference:
    Transform
    GradientStops
  Properties expressed via XML child elements:
    <RadialGradientBrush.Opacity>
    <RadialGradientBrush.Transform>
    <RadialGradientBrush.Center>
    <RadialGradientBrush.Focus>
    <RadialGradientBrush.RadiusX>
    <RadialGradientBrush.RadiusY>
    <RadialGradientBrush.GradientStops>

<GradientStops>
  Content via XML child Elements:
    <GradientStop> [1-N of these children]

<GradientStop>
  Properties expressed via simple XML attributes directly:
    Color
    Offset
  Properties expressed via XML child elements:
    <GradientStop.Color>
    <GradientStop.Offset>

Resource dictionaries for FixedPage or Canvas

    <FixedPage.Resources>
    <Canvas.Resources>

These elements are discussed above in the section that discusses Resource Dictionaries.

Opacity masks for alpha transparency

    <Canvas.OpacityMask>
        Attributes: [none]
        Property value expressed as a single XML child element
        [Canvas.OpacityMask has exactly one of these children]:
            <SolidColorBrush>
            <ImageBrush>
            <DrawingBrush>
            <LinearGradientBrush>
            <RadialGradientBrush>
    <Path.OpacityMask>
        Attributes: [none]
        Property value expressed as a single XML child element
        [Path.OpacityMask has exactly one of these children]:
            <SolidColorBrush>
            <ImageBrush>
            <DrawingBrush>
            <LinearGradientBrush>
            <RadialGradientBrush>
    <Glyphs.OpacityMask>
        Attributes: [none]
        Property value expressed as a single XML child element
        [Glyphs.OpacityMask has exactly one of these children]:
            <SolidColorBrush>
            <ImageBrush>
            <DrawingBrush>
            <LinearGradientBrush>
            <RadialGradientBrush>

Clipping paths

    <Canvas.Clip>
        Attributes: [none]
        Property value expressed as a single XML child element
        [Canvas.Clip has exactly one of these children]:
            <GeometryCollection>
            <PathGeometry>
    <Path.Clip>
        Attributes: [none]
        Property value expressed as a single XML child element
        [Path.Clip has exactly one of these children]:
            <GeometryCollection>
            <PathGeometry>
    <Glyphs.Clip>
        Attributes: [none]
        Property value expressed as a single XML child element
        [Glyphs.Clip has exactly one of these children]:
            <GeometryCollection>
            <PathGeometry>

Transforms

    <Canvas.RenderTransform>
        Property value expressed as a single XML child element:
            <MatrixTransform> [required]
    <Path.RenderTransform>
        Property value expressed as a single XML child element:
            <MatrixTransform> [required]
    <Glyphs.RenderTransform>
        Property value expressed as a single XML child element:
            <MatrixTransform> [required]
    <MatrixTransform>
        Properties expressed via simple XML attributes directly:
            Matrix
        Properties expressed via XML child elements:
            <MatrixTransform.Matrix>
    <ImageBrush.Transform>
        Properties expressed via simple XML attributes directly:
            MatrixTransform
        Properties expressed via XML child elements:
            <ImageBrush.Transform.MatrixTransform>
    <DrawingBrush.Transform>
        Properties expressed via simple XML attributes directly:
            MatrixTransform
        Properties expressed via XML child elements:
            <DrawingBrush.Transform.MatrixTransform>
    <LinearGradientBrush.Transform>
        Properties expressed via simple XML attributes directly:
            MatrixTransform
        Properties expressed via XML child elements:
            <LinearGradientBrush.Transform.MatrixTransform>
    <RadialGradientBrush.Transform>
        Properties expressed via simple XML attributes directly:
            MatrixTransform
        Properties expressed via XML child elements:
            <RadialGradientBrush.Transform.MatrixTransform>

FixedPage Markup

Each FixedPage part represents a page's contents in XML markup rooted ina <FixedPage> element. This FixedPage markup provides WYSIWYG fidelityof a document between writers and readers, with only a small set ofelements and properties: <Path> and <Glyphs> elements (which together doall of the drawing), and the <Canvas> element to group them.

Common Element Properties

Before discussing attributes specific to each element in FixedPage markup, consider the attributes common to the drawing and grouping elements: Opacity, Clip, RenderTransform, and OpacityMask. Not only are these the only properties common to the top-level elements, they are also the only properties that “accumulate” their results from parent to child element, as described in the Composition Rules section above. The accumulation is a result of the application of the Composition Rules. The table that follows provides a summary description of these common attributes, followed by a more thorough discussion of each of the attributes.

| Property | Kind | Elements | Description |
|---|---|---|---|
| Opacity | Attribute | Canvas, Path, Glyphs, SolidColorBrush, ImageBrush, DrawingBrush, LinearGradientBrush, RadialGradientBrush | Defines uniform transparency of the element. |
| Clip | Child Element | Canvas, Path, Glyphs | Clip restricts the region to which a brush can be applied on the canvas. |
| RenderTransform | Child Element | Canvas, Path, Glyphs | RenderTransform establishes a new coordinate frame for the children of the element. Only MatrixTransform is supported. |
| OpacityMask | Child Element | Canvas, Path, Glyphs | Specifies a rectangular mask of alpha values that is applied in the same fashion as the Opacity attribute, but allows a different alpha value on a pixel-by-pixel basis. |

Opacity Attribute

Opacity is used to transparently blend the two elements when rendering (Alpha Blending). The Opacity attribute ranges from 0 (fully transparent) to 1 (fully opaque). Values outside of this inclusive range are clamped to this range during markup parsing. So, effectively, any value in (−∞, 0] is transparent and any value in [1, ∞) is opaque.

The Opacity Attribute is applied through the following computations(assuming non-premultiplied source and destination colors, bothspecified as scRGB):

$O_E$: Opacity attribute of element or alpha value at corresponding position in OpacityMask

$A_S$: Alpha value present in source surface

$R_S$: Red value present in source surface

$G_S$: Green value present in source surface

$B_S$: Blue value present in source surface

$A_D$: Alpha value already present in destination surface

$R_D$: Red value already present in destination surface

$G_D$: Green value already present in destination surface

$B_D$: Blue value already present in destination surface

$A^*$: Resulting Alpha value for destination surface

$R^*$: Resulting Red value for destination surface

$G^*$: Resulting Green value for destination surface

$B^*$: Resulting Blue value for destination surface

All values designated with a T subscript are temporary values (e.g. $R_{T1}$).

Step 1: Multiply Source Alpha Value with Opacity Value

$$A_S = A_S \cdot O_E$$

Step 2: Premultiply Source Alpha

$$A_{T1} = A_S$$
$$R_{T1} = R_S \cdot A_S$$
$$G_{T1} = G_S \cdot A_S$$
$$B_{T1} = B_S \cdot A_S$$

Step 3: Premultiply Destination Alpha

$$A_{T2} = A_D$$
$$R_{T2} = R_D \cdot A_D$$
$$G_{T2} = G_D \cdot A_D$$
$$B_{T2} = B_D \cdot A_D$$

Step 4: Blend

$$A_{T2} = (1 - A_{T1}) \cdot A_{T2} + A_{T1}$$
$$R_{T2} = (1 - A_{T1}) \cdot R_{T2} + R_{T1}$$
$$G_{T2} = (1 - A_{T1}) \cdot G_{T2} + G_{T1}$$
$$B_{T2} = (1 - A_{T1}) \cdot B_{T2} + B_{T1}$$

Step 5: Reverse Pre-multiplication

If $A_{T2} = 0$, set all of $A^*$, $R^*$, $G^*$, $B^*$ to 0.

Else:

$$A^* = A_{T2}$$
$$R^* = R_{T2} / A_{T2}$$
$$G^* = G_{T2} / A_{T2}$$
$$B^* = B_{T2} / A_{T2}$$
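The blending computation above can be sketched for a single pixel as follows. This is a minimal illustration, not a reference implementation: the function name `blend_pixel` and the tuple layout (A, R, G, B) with components as floats are assumptions made for the example, and the opacity is clamped to the range 0 to 1 as described for the Opacity attribute.

```python
def blend_pixel(src, dst, opacity):
    """Blend a non-premultiplied (A, R, G, B) source pixel over a destination pixel.

    Hypothetical helper: the name and tuple layout are assumptions for illustration.
    """
    # Clamp the opacity to the inclusive range [0, 1].
    o_e = min(max(opacity, 0.0), 1.0)
    a_s, r_s, g_s, b_s = src
    a_d, r_d, g_d, b_d = dst

    # Multiply the source alpha value with the opacity value.
    a_s = a_s * o_e

    # Premultiply source and destination colors by their alpha values.
    a_t1, r_t1, g_t1, b_t1 = a_s, r_s * a_s, g_s * a_s, b_s * a_s
    a_t2, r_t2, g_t2, b_t2 = a_d, r_d * a_d, g_d * a_d, b_d * a_d

    # Blend the premultiplied source over the premultiplied destination.
    a_t2 = (1 - a_t1) * a_t2 + a_t1
    r_t2 = (1 - a_t1) * r_t2 + r_t1
    g_t2 = (1 - a_t1) * g_t2 + g_t1
    b_t2 = (1 - a_t1) * b_t2 + b_t1

    # Reverse the premultiplication, guarding against division by zero.
    if a_t2 == 0:
        return (0.0, 0.0, 0.0, 0.0)
    return (a_t2, r_t2 / a_t2, g_t2 / a_t2, b_t2 / a_t2)
```

For example, blending opaque red over opaque blue with an opacity of 0.5 yields an opaque pixel whose red and blue components are both 0.5.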

Clip Property

The Clip property is specified as one of the geometric elements<GeometryCollection> or <PathGeometry> (see Path.Data for details).

The Clip property is applied in the following way:

-   All rendered contents that fall inside the geometric element described by the Clip child element are visible.
-   All rendered contents that fall outside the geometric element described by the Clip child element are not visible.

RenderTransform Child Element

MatrixTransform is the only transformation attribute available to elements. It expresses an affine transformation. The syntax follows:

    <X.RenderTransform>
       <MatrixTransform Matrix="1,0,0,1,0,0"/>
    </X.RenderTransform>

X represents the element to which the transform is applied.

The six numbers specified in the Matrix attribute are m00, m01, m10,m11, dx, dy.

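The full matrix looks like:

$$\begin{pmatrix} \mathrm{m00} & \mathrm{m01} & 0 \\ \mathrm{m10} & \mathrm{m11} & 0 \\ \mathrm{dx} & \mathrm{dy} & 1 \end{pmatrix}$$

A given coordinate X,Y is transformed with a RenderTransform to yield the resulting coordinate X′,Y′ by applying these computations:

$$X' = X \cdot \mathrm{m00} + Y \cdot \mathrm{m10} + \mathrm{dx}$$
$$Y' = X \cdot \mathrm{m01} + Y \cdot \mathrm{m11} + \mathrm{dy}$$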
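As an illustration of the computation above, the following sketch parses a Matrix attribute string and applies it to a coordinate. The function names `parse_matrix` and `apply_transform` are hypothetical, and the parser assumes the six values are separated by commas as in the syntax example.

```python
def parse_matrix(text):
    """Parse a Matrix attribute value such as "1,0,0,1,0,0".

    Returns the six numbers in the order m00, m01, m10, m11, dx, dy.
    """
    values = tuple(float(v) for v in text.split(","))
    if len(values) != 6:
        raise ValueError("a Matrix attribute needs exactly six numbers")
    return values


def apply_transform(matrix, x, y):
    """Transform the coordinate (x, y) with the affine matrix."""
    m00, m01, m10, m11, dx, dy = matrix
    return (x * m00 + y * m10 + dx, x * m01 + y * m11 + dy)
```

With the matrix "1,0,0,1,100,100" (a pure translation), the point (10, 20) maps to (110, 120); the identity matrix "1,0,0,1,0,0" leaves every point unchanged.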

OpacityMask Child Element

The OpacityMask specifies a Brush, but in contrast to a Fill Brush, only the alpha channel (see Opacity attribute above) of the brush is used as an additional parameter for rendering the element. Each alpha value for each pixel of the element is then additionally multiplied with the alpha value at the corresponding position in the OpacityMask Brush.

The <Canvas> Element

The <Canvas> element is used to group elements together. Typically, FixedPage elements are grouped together in a <Canvas> when they share a composed common attribute (i.e., Opacity, Clip, RenderTransform, or OpacityMask). By grouping these elements together on a Canvas, common attributes can often be applied to the canvas instead of to the individual elements.

Attributes and Child Elements of <Canvas>

The <Canvas> element has only the common attributes described earlier: Opacity, Clip, RenderTransform, and OpacityMask. They are used with the <Canvas> element as described in the table below:

| Property | Kind | Effect on Canvas |
|---|---|---|
| Opacity | Attribute | Defines uniform transparency of the canvas. |
| Clip | Child Element | Clip describes the region to which a brush can be applied by the Canvas' child elements. |
| RenderTransform | Child Element | RenderTransform establishes a new coordinate frame for the children elements of the canvas, such as another canvas. Only MatrixTransform is supported. |
| OpacityMask | Child Element | Specifies a rectangular mask of alpha values that is applied in the same fashion as the Opacity attribute, but allows a different alpha value on a pixel-by-pixel basis. |

The following markup example illustrates the use of <Canvas>.

    <Canvas>
      <Path Fill="#0000FF">
        <Path.Data>
          <PathGeometry>
            <PathFigure>
              <StartSegment Point="0,0"/>
              <PolyLineSegment Points="100,0 100,100 0,100 0,0"/>
              <CloseSegment/>
            </PathFigure>
          </PathGeometry>
        </Path.Data>
      </Path>
    </Canvas>

With respect to the reading order in Canvas markup, consider thefollowing. As with FixedPage, the markup order of the Glyphs childelements contained within a Canvas must be the same as the desiredreading order of the text content. This reading order may be used bothfor interactive selection/copy of sequential text from a FixedPage in aviewer, and for enabling access to sequential text by accessibilitytechnology. It is the responsibility of the application generating theFixedPage markup to ensure this correspondence between markup order andreading order.

Child Glyphs elements contained within nested Canvas elements areordered in-line between sibling Glyphs elements occurring before andafter the Canvas.

Example:

    <FixedPage>
      <Glyphs . . . UnicodeString="Now is the time for" />
      <Canvas>
        <Glyphs . . . UnicodeString="all good men and women" />
        <Glyphs . . . UnicodeString="to come to the aid " />
      </Canvas>
      <Glyphs . . . UnicodeString="of the party." />
    </FixedPage>
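A consumer that wants the sequential text of a page could recover it by walking the markup in document order, as in the sketch below. This is only an illustration: the markup is a simplified, hypothetical variant of the example (other Glyphs attributes are omitted and trailing spaces are added to the strings), and the function name `reading_order_text` is an assumption.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical page markup: only UnicodeString is kept on each Glyphs.
MARKUP = """
<FixedPage>
  <Glyphs UnicodeString="Now is the time for " />
  <Canvas>
    <Glyphs UnicodeString="all good men and women " />
    <Glyphs UnicodeString="to come to the aid " />
  </Canvas>
  <Glyphs UnicodeString="of the party." />
</FixedPage>
"""


def reading_order_text(markup):
    """Concatenate the text of all Glyphs elements in markup (reading) order."""
    root = ET.fromstring(markup)
    # Element.iter() visits elements in document order, so Glyphs nested in a
    # Canvas fall in-line between the sibling Glyphs before and after it.
    return "".join(g.get("UnicodeString", "") for g in root.iter("Glyphs"))
```

Calling `reading_order_text(MARKUP)` returns the sentence with the nested Canvas text placed between the first and last Glyphs runs.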

The <Path> Element

The Path Element is an XML-based element that describes a geometric region. The geometric region is a shape which may be filled, or used as a clipping path. Common geometry types, such as rectangle and ellipse, can be represented using Path geometries. A path is described by specifying the required Path.Data child element and the rendering attributes, such as Fill or Opacity.

Properties and Child Elements of <Path>

The following properties are applicable to <Path> elements as described below:

| Property | Kind | Effect on Path |
|---|---|---|
| Opacity | Property | Defines uniform transparency of the filled path. |
| Clip | Child Element | Clip describes the region to which a brush can be applied by the path's geometry. |
| RenderTransform | Child Element | RenderTransform establishes a new coordinate frame for the children elements of the path, such as the geometry defined by Path.Data. Only MatrixTransform is supported. |
| OpacityMask | Child Element | Specifies a rectangular mask of alpha values that is applied in the same fashion as the Opacity attribute, but allows a different alpha value for different areas of the surface. |
| Data | Child Element | Describes the path's geometry. |
| Fill | Child Element | Describes the brush used to paint the path's geometry. |

To describe how to paint a region described by the geometry of the<Path.Data> child element, use the Fill property. To restrict the regionon which <Path.Data> shapes can be drawn, use the Clip property.

Using <Path> to Describe Geometries

A path's geometry is specified as a series of nested child elements of <Path.Data>, as shown below. The geometry may be represented with either a <GeometryCollection> containing a set of <PathGeometry> child elements, or a single <PathGeometry> child element containing <PathFigure> elements.

    <Path>
      <Path.Data>
        <GeometryCollection>
          <PathGeometry>
            <PathFigure>
              ...
            </PathFigure>
          </PathGeometry>
        </GeometryCollection>
      </Path.Data>
    </Path>

The same <GeometryCollection> or <PathGeometry> elements define thegeometry for a clipping path used in the Clip property of Canvas, Path,or Glyphs.

The following table introduces the hierarchy of child elements defining Path geometries.

| Geometry Elements | Description |
|---|---|
| GeometryCollection | A set of PathGeometry elements rendered using Boolean CombineMode operations. |
| PathGeometry | A set of PathFigure elements that are each filled using the same FillRule option. |
| PathFigure | A set of one or more segment elements. |
| StartSegment, PolyLineSegment, PolyBezierSegment, CloseSegment | |

GeometryCollection

A GeometryCollection is a set of geometric objects that are combined together for rendering according to Boolean CombineMode options. The GeometryCollection element is the mechanism in FixedPage markup for building visual combinations of geometric shapes.

| Attributes | Effect on GeometryCollection |
|---|---|
| CombineMode | Specifies different modes for combining geometries. |

The CombineMode attribute specifies the Boolean operation used to combine the set of geometric shapes in a GeometryCollection. Depending on the mode, different regions will be included or excluded.

| CombineMode Options | Description |
|---|---|
| Complement | Specifies that the existing region is replaced by the result of the existing region being removed from the new region. Said differently, the existing region is excluded from the new region. |
| Exclude | Specifies that the existing region is replaced by the result of the new region being removed from the existing region. Said differently, the new region is excluded from the existing region. |
| Intersect | Two regions are combined by taking their intersection. |
| Union | Two regions are combined by taking the union of both. |
| Xor | Two regions are combined by taking only the areas enclosed by one or the other region, but not both. |

CombineModes are handled as follows:

-   Not Commutative: Complement and Exclude are not commutative and therefore are defined between the first geometry in the GeometryCollection and each of the remaining geometries individually. For example, for the set {g1, g2, g3} a CombineMode of Exclude would be applied as ((g1 exclude g2) and (g1 exclude g3)).
-   Commutative: Boolean operations Union, Xor, Intersect are commutative and therefore apply order-independent to the geometries.
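The handling of CombineMode can be sketched by modeling each region as a Python set of covered cells; this modeling, and the function name `combine`, are assumptions made purely for illustration. Exclude follows the ((g1 exclude g2) and (g1 exclude g3)) pattern given above. Complement is left out because the text does not spell out its multi-geometry expansion.

```python
from functools import reduce


def combine(mode, geometries):
    """Combine regions (modeled as sets of covered cells) under a CombineMode."""
    first, rest = geometries[0], geometries[1:]
    if mode == "Union":
        return reduce(lambda a, b: a | b, rest, first)
    if mode == "Intersect":
        return reduce(lambda a, b: a & b, rest, first)
    if mode == "Xor":
        return reduce(lambda a, b: a ^ b, rest, first)
    if mode == "Exclude":
        # Each remaining geometry is excluded from the first one, and the
        # partial results are combined with "and" (set intersection).
        return reduce(lambda acc, g: acc & (first - g), rest, first)
    raise ValueError("unsupported CombineMode in this sketch: " + mode)
```

For g1 = {1, 2, 3, 4}, g2 = {2} and g3 = {4, 5}, Exclude keeps {1, 3}, while Union yields {1, 2, 3, 4, 5}. Reordering g2 and g3 does not change either result, but moving g1 out of the first position changes the Exclude result.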

PathGeometry

A PathGeometry element contains a set of PathFigure elements. The union of the PathFigures defines the interior of the PathGeometry.

| Attributes | Effect on PathGeometry |
|---|---|
| FillRule | Specifies alternate algorithms for filling paths that describe an enclosed area. |

With respect to the FillRule attribute, consider the following. Thefilled area of PathGeometry is defined by taking all of the containedPathFigure that have their Filled attribute set to true and applying theFillRule to determine the enclosed area. FillRule options specify howthe intersecting areas of Figure elements contained in a Geometry arecombined to form the resulting area of the Geometry.

In accordance with the described embodiment, EvenOdd Fill and NonZeroFill algorithms are provided.

The EvenOdd Fill algorithm determines the “insideness” of a point on the canvas by drawing a ray from that point to infinity in any direction and counting the number of path segments from the given shape that the ray crosses. If this number is odd, the point is inside; if even, the point is outside.

The NonZero Fill algorithm determines the “insideness” of a point on the canvas by drawing a ray from that point to infinity in any direction and then examining the places where a segment of the shape crosses the ray. Starting with a count of zero, add one each time a path segment crosses the ray from left to right and subtract one each time a path segment crosses the ray from right to left. After counting the crossings, if the result is zero then the point is outside the path. Otherwise, it is inside.
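The two fill rules can be compared with a small ray-casting sketch. It follows the conventional definitions: the even-odd rule tests the parity of the number of crossings, and the nonzero rule tests whether the signed sum of crossings is nonzero. The polygon representation (a list of vertices, implicitly closed), the function names, and the sample figure `NESTED` are assumptions for illustration; curved segments are not handled.

```python
def crossings(point, polygon):
    """Cast a ray from point in the +x direction across the polygon's edges.

    Returns (total, signed): the number of crossings, and the sum of +1 for
    edges heading toward increasing y and -1 for edges heading the other way.
    """
    px, py = point
    total = 0
    signed = 0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line y = py can cross the ray.
        if (y0 <= py) != (y1 <= py):
            x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            if x_at > px:
                total += 1
                signed += 1 if y1 > y0 else -1
    return total, signed


def inside_even_odd(point, polygon):
    return crossings(point, polygon)[0] % 2 == 1


def inside_non_zero(point, polygon):
    return crossings(point, polygon)[1] != 0


# Hypothetical figure: an outer square and an inner square traced in the same
# direction as one closed path.
NESTED = [(0, 0), (6, 0), (6, 6), (0, 6), (0, 0),
          (2, 2), (4, 2), (4, 4), (2, 4), (2, 2)]
```

For the point (3, 3), which lies within the inner square, the even-odd rule reports outside (two crossings) while the nonzero rule reports inside (signed sum of 2), so the inner square appears as a hole only under the even-odd rule.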

PathFigure

A PathFigure element is composed of a set of one or more line or curve segments. The segment elements define the shape of the PathFigure. The PathFigure must always define a closed shape.

| Attributes | Effect on PathFigure |
|---|---|
| FillRule | Specifies alternate algorithms for filling paths that describe an enclosed area. |

A figure requires a starting point, after which each line or curve segment continues from the last point added. The first segment in the PathFigure set must be a StartSegment, and CloseSegment must be the last segment. StartSegment has a Point attribute. CloseSegment has no attributes.

| StartSegment Attribute | Description |
|---|---|
| Point | The location of the line segment (starting point). |
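The ordering rule for segments can be checked mechanically, as in the following sketch. The function name `is_valid_path_figure` is hypothetical, and the requirement of at least one PolyLineSegment or PolyBezierSegment between the start and close segments is taken from the element summary earlier in this description.

```python
import xml.etree.ElementTree as ET


def is_valid_path_figure(figure):
    """Check that a PathFigure element has its segments in the required order."""
    tags = [child.tag for child in figure]
    # StartSegment first, CloseSegment last, and at least one Poly* in between.
    if len(tags) < 3:
        return False
    if tags[0] != "StartSegment" or tags[-1] != "CloseSegment":
        return False
    return all(t in ("PolyLineSegment", "PolyBezierSegment") for t in tags[1:-1])


# Example figures (markup written for this sketch).
GOOD = ET.fromstring(
    '<PathFigure><StartSegment Point="0,0"/>'
    '<PolyLineSegment Points="100,0 100,100 0,100"/>'
    '<CloseSegment/></PathFigure>'
)
BAD = ET.fromstring(
    '<PathFigure><PolyLineSegment Points="100,0 100,100 0,100"/>'
    '<StartSegment Point="0,0"/><CloseSegment/></PathFigure>'
)
```

Here `is_valid_path_figure(GOOD)` is true, while `BAD` is rejected because its StartSegment does not come first.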

Fixed-Payload Markup for Path.Data Geometries

The following provides the markup for drawing and filling a Path on a Canvas. In the specific example below, a rectangular Path is drawn on a Canvas and filled with a solid blue brush.

    <Canvas>
      <Path Fill="#0000FF">
        <Path.Data>
          <PathGeometry>
            <PathFigure>
              <StartSegment Point="0,0"/>
              <PolyLineSegment Points="100,0 100,100 0,100 0,0"/>
              <CloseSegment/>
            </PathFigure>
          </PathGeometry>
        </Path.Data>
      </Path>
    </Canvas>

The following markup describes drawing a cubic Bézier curve. That is, in addition to the PolyLineSegment, Fixed-Payload markup includes the PolyBezierSegment for drawing cubic Bézier curves.

    <Canvas>
      <Path Fill="#0000FF">
        <Path.Data>
          <PathGeometry>
            <PathFigure>
              <StartSegment Point="0,0"/>
              <PolyBezierSegment Points="100,0 100,100 0,100 0,0"/>
              <CloseSegment/>
            </PathFigure>
          </PathGeometry>
        </Path.Data>
      </Path>
    </Canvas>

Brushes

A brush is used to paint the interior of geometric shapes defined by the <Path> element, and to fill the character bitmaps rendered with a <Glyphs> element. A brush is also used in defining the alpha-transparency mask in <Canvas.OpacityMask>, <Path.OpacityMask>, and <Glyphs.OpacityMask>. The FixedPage markup includes the following brushes:

| Brush Type | Description |
|---|---|
| SolidColorBrush | Fills defined geometric regions with a solid color. |
| ImageBrush | Fills a region with an image. |
| DrawingBrush | Fills a region with a vector drawing. |
| LinearGradientBrush | Fills a region with a linear gradient. |
| RadialGradientBrush | Fills a region with a radial gradient. |

Attributes vary across brushes, although all brushes have an Opacityattribute. The ImageBrush and DrawingBrush share tiling capabilities.The two gradient-fill brushes have attributes in common as well.

The use of a brush child element in markup is shown below:

    <Path>
      <Path.Fill>
        <SolidColorBrush Color="#00FFFF"/>
      </Path.Fill>
      ...
    </Path>

Common Properties for Brushes

In accordance with the described embodiment, the following properties are applicable to all brushes, except for the simple brush SolidColorBrush, which has fewer optional child elements.

| Attribute | Brush Type | Description |
|---|---|---|
| Opacity | All brushes | |

| Child Element | Brush Type | Description |
|---|---|---|
| Transform | All brushes except for SolidColorBrush | Describes a MatrixTransform applied to the brush's coordinate space. |

Common Attributes for DrawingBrush and ImageBrush

| Attribute | Brush Type | Values |
|---|---|---|
| HorizontalAlignment | DrawingBrush, ImageBrush | Center, Left, or Right |
| VerticalAlignment | DrawingBrush, ImageBrush | Center, Bottom, or Top |
| ViewBox | DrawingBrush, ImageBrush | |
| ViewPort | DrawingBrush, ImageBrush | |
| Stretch | DrawingBrush, ImageBrush | None, Fill, Uniform, or UniformToFill |
| TileMode | DrawingBrush, ImageBrush | None, Tile, FlipY, FlipX, or FlipXY |
| ContentUnits | DrawingBrush, ImageBrush | Absolute or RelativeToBoundingBox |
| ViewportUnits | DrawingBrush, ImageBrush | Absolute or RelativeToBoundingBox |

The Horizontal Alignment attribute specifies how the brush is alignedhorizontally within the area it fills out. The Vertical Alignmentattribute specifies how the brush is aligned vertically within the areait fills out. The ViewBox attribute has a default value of (0,0,0,0),interpreted as unset. When unset, no adjustment is made and the Stretchattribute is ignored. The viewbox specifies a new coordinate system forthe contents, i.e. redefines the extent and origin of the viewport. TheStretch attribute helps to specify how those contents map into theviewport. The value of the viewBox attribute is a list of four“unitless” numbers <min-x>, <min-y>, <width> and <height>, separated bywhitespace and/or a comma, and is of type Rect. The Viewbox rectspecifies the rectangle in user space that maps to the bounding box. Itworks the same as inserting a scaleX and scaleY. The Stretch attribute(in case the option is other than none) provides additional control forpreserving the aspect ratio of the graphics. An additionaltransformation is applied to all descendants of the given element toachieve the specified effect. If there is a transform on the Brush, itis applied “above” the mapping to ViewBox.

The Stretch attribute has the following modes: None, Fill, Uniform, UniformToFill.

| Stretch Attribute Option | Description |
|---|---|
| None | Default. Preserve original size. |
| Fill | Aspect ratio is not preserved and the content is scaled to fill the bounds established. |
| Uniform | Scale size uniformly until the image fits the bounds established. |
| UniformToFill | Scale size uniformly to fill the bounds established and clip as necessary. |
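One way to read the four Stretch options is as a choice of horizontal and vertical scale factors, sketched below. The function name `stretch_scale` and its parameters (content size and bounds size as widths and heights) are assumptions for illustration; alignment and clipping are not modeled.

```python
def stretch_scale(mode, content_w, content_h, bounds_w, bounds_h):
    """Return the (scale_x, scale_y) factors for a Stretch option."""
    sx = bounds_w / content_w
    sy = bounds_h / content_h
    if mode == "None":
        # Preserve the original size.
        return (1.0, 1.0)
    if mode == "Fill":
        # Fill the bounds in both directions; aspect ratio is not preserved.
        return (sx, sy)
    if mode == "Uniform":
        # Largest uniform scale at which the content still fits the bounds.
        s = min(sx, sy)
        return (s, s)
    if mode == "UniformToFill":
        # Smallest uniform scale that covers the bounds; overflow is clipped.
        s = max(sx, sy)
        return (s, s)
    raise ValueError("unknown Stretch option: " + mode)
```

For content of size 100 by 50 in bounds of 200 by 200, Fill gives (2.0, 4.0), Uniform gives (2.0, 2.0), and UniformToFill gives (4.0, 4.0).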

Simple Brushes and their Attributes

The Path.Brush and Canvas.Brush child elements include the following:SolidColorBrush, ImageBrush, and DrawingBrush.

SolidColorBrush fills defined geometric regions with a solid color. If there is an alpha component of the color, it is combined in a multiplicative way with the corresponding opacity attribute in the Brush.

| Attributes | Effect |
|---|---|
| Color | Specifies color for filled elements |

The following example illustrates how color attributes are expressed for the SolidColorBrush.

    <Path>
      <Path.Fill>
        <SolidColorBrush Color="#00FFFF"/>
      </Path.Fill>
      ...
    </Path>

ImageBrush can be used to fill a space with an image. The markup for ImageBrush allows a URI to be specified. If all other attributes are set to their default values, the image will be stretched to fill the bounding box of the region.

| Attributes | Effect |
|---|---|
| ImageSource | Specifies URI of image resource. |

The ImageSource attribute must reference either one of the supportedReach Image Formats or a selector which leads to an image of one ofthese types.

DrawingBrush can be used to fill a space with a vector drawing. DrawingBrush has a Drawing Child Element, whose use in markup is shown below.

    <Path>
      <Path.Fill>
        <DrawingBrush>
          <DrawingBrush.Drawing>
            <Drawing>
              <Path ... />
              <Glyphs ... />
            </Drawing>
          </DrawingBrush.Drawing>
        </DrawingBrush>
      </Path.Fill>
    </Path>

Gradient Brushes and their Attributes

Gradients are drawn by specifying a set of gradient stops as XML ChildElements of the gradient brushes. These gradient stops specify thecolors along some sort of progression. There are two types of gradientbrushes supported in this framework: linear and radial.

The gradient is drawn by interpolating between the gradient stops in the specified color space. LinearGradientBrush and RadialGradientBrush share the following common attributes:

| Attribute | Description |
|---|---|
| SpreadMethod | This property describes how the brush should fill the content area outside of the primary, initial gradient area. Default value is Pad. |
| MappingMode | This property determines whether the parameters describing the gradient are interpreted relative to the object bounding box. Default value is relative-to-bounding-box. |

| Child element | Description |
|---|---|
| GradientStops | Holds an ordered sequence of GradientStop elements |

With respect to the SpreadMethod attribute, consider the following. SpreadMethod options specify how the space is filled. The default value is Pad.

| SpreadMethod Attribute Options | Effect on Gradient |
|---|---|
| Pad | The first color and the last color are used to fill the remaining space at the beginning and end, respectively. |
| Reflect | The gradient stops are replayed in reverse order repeatedly to fill the space. |
| Repeat | The gradient stops are repeated in order until the space is filled. |
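The three SpreadMethod options can be viewed as different ways of folding an arbitrary gradient offset back into the range 0 to 1, as sketched below. The function name `spread_offset` is hypothetical, and Reflect is interpreted in the conventional way, with the gradient alternating between forward and reverse order.

```python
import math


def spread_offset(t, method="Pad"):
    """Map an unbounded gradient offset t into the range [0, 1]."""
    if method == "Pad":
        # Outside the gradient area, the first or last color is used.
        return min(max(t, 0.0), 1.0)
    if method == "Repeat":
        # The stops start over after every whole unit.
        return t - math.floor(t)
    if method == "Reflect":
        # The stops alternate direction: forward on [0, 1], reversed on [1, 2].
        period = t % 2.0
        return period if period <= 1.0 else 2.0 - period
    raise ValueError("unknown SpreadMethod: " + method)
```

For an offset of 1.25, Pad yields 1.0, Repeat yields 0.25, and Reflect yields 0.75.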

MappingMode Attribute

With respect to the LinearGradientBrush, consider the following. The LinearGradientBrush specifies a linear gradient brush along a vector.

| Attribute | Description |
|---|---|
| EndPoint | End point of the linear gradient. The LinearGradientBrush interpolates the colors from the StartPoint to the EndPoint, where StartPoint represents offset 0, and the EndPoint represents offset 1. Default is 1, 1. |
| StartPoint | Start point of the linear gradient. |

The following markup example shows the use of the LinearGradientBrush. A page with a rectangular path is filled with a linear gradient:

    <FixedPanel>
      <FixedPage>
        <Path>
          <Path.Fill>
            <LinearGradientBrush StartPoint="0,0" EndPoint="1,0">
              <LinearGradientBrush.GradientStops>
                <GradientStopCollection>
                  <GradientStop Color="#FF0000" Offset="0"/>
                  <GradientStop Color="#0000FF" Offset="1"/>
                </GradientStopCollection>
              </LinearGradientBrush.GradientStops>
            </LinearGradientBrush>
          </Path.Fill>
          <Path.Data>
            <PathGeometry>
              <PathFigure>
                <StartSegment Point="0,0"/>
                <PolyLineSegment Points="100,0 100,100 0,100"/>
                <CloseSegment/>
              </PathFigure>
            </PathGeometry>
          </Path.Data>
        </Path>
      </FixedPage>
    </FixedPanel>

This example shows a page with a rectangular path that is filled with a linear gradient. The Path also has a clip property in the shape of an octagon which clips it.

    <FixedPanel>
      <FixedPage>
        <Path>
          <Path.Clip>
            <PathGeometry>
              <PathFigure>
                <StartSegment Point="25,0"/>
                <PolyLineSegment Points="75,0 100,25 100,75 75,100 25,100 0,75 0,25"/>
                <CloseSegment/>
              </PathFigure>
            </PathGeometry>
          </Path.Clip>
          <Path.Fill>
            <LinearGradientBrush StartPoint="0,0" EndPoint="1,0">
              <LinearGradientBrush.GradientStops>
                <GradientStopCollection>
                  <GradientStop Color="#FF0000" Offset="0"/>
                  <GradientStop Color="#0000FF" Offset="1"/>
                </GradientStopCollection>
              </LinearGradientBrush.GradientStops>
            </LinearGradientBrush>
          </Path.Fill>
          <Path.Data>
            <PathGeometry>
              <PathFigure>
                <StartSegment Point="0,0"/>
                <PolyLineSegment Points="100,0 100,100 0,100"/>
                <CloseSegment/>
              </PathFigure>
            </PathGeometry>
          </Path.Data>
        </Path>
      </FixedPage>
    </FixedPanel>
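How a linear gradient assigns a color to a point can be sketched in two steps: project the point onto the vector from StartPoint to EndPoint to obtain an offset, then interpolate between the neighboring gradient stops. The function names and the use of plain component-wise linear interpolation on (r, g, b) tuples are assumptions for illustration; the color space in which a consumer interpolates is not fixed by this sketch.

```python
def gradient_offset(point, start, end):
    """Project point onto the start-to-end vector: 0 at start, 1 at end."""
    vx, vy = end[0] - start[0], end[1] - start[1]
    length_sq = vx * vx + vy * vy
    if length_sq == 0:
        return 0.0
    return ((point[0] - start[0]) * vx + (point[1] - start[1]) * vy) / length_sq


def color_at(offset, stops):
    """Interpolate a color from stops, a list of (offset, (r, g, b)) sorted by offset."""
    if offset <= stops[0][0]:
        return stops[0][1]
    if offset >= stops[-1][0]:
        return stops[-1][1]
    for (o0, c0), (o1, c1) in zip(stops, stops[1:]):
        if o0 <= offset <= o1:
            if o1 == o0:
                return c1
            f = (offset - o0) / (o1 - o0)
            return tuple(a + (b - a) * f for a, b in zip(c0, c1))


# Red-to-blue stops, matching the colors used in the markup example.
RED_TO_BLUE = [(0.0, (1.0, 0.0, 0.0)), (1.0, (0.0, 0.0, 1.0))]
```

With StartPoint (0, 0) and EndPoint (1, 0), the point (0.5, 0.3) has offset 0.5, and `color_at(0.5, RED_TO_BLUE)` gives (0.5, 0.0, 0.5), an even mix of red and blue.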

The RadialGradient is similar in programming model to the linear gradient. However, whereas the linear gradient has a start and end point to define the gradient vector, the radial gradient has a circle along with a focal point to define the gradient behavior. The circle defines the end point of the gradient; in other words, a gradient stop at 1.0 defines the color at the circle's circumference. The focal point defines the center of the gradient. A gradient stop at 0.0 defines the color at the focal point.

| Attribute | Description |
|---|---|
| Center | Center point of this radial gradient. The RadialGradientBrush interpolates the colors from the Focus to the circumference of the ellipse. The circumference is determined by the Center and the radii. Default is 0.5, 0.5 |
| Focus | Focus of the radial gradient. |
| RadiusX | Radius in the X dimension of the ellipse which defines the radial gradient. Default is 0.5 |
| RadiusY | Radius in the Y dimension of the ellipse which defines the radial gradient. Default is 0.5 |
| FillGradient | Pad, Reflect, Repeat |
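For the special case in which the Focus coincides with the Center, the gradient offset of a point reduces to its normalized distance from the center of the ellipse. The sketch below covers only that simplified case, which is an assumption made for illustration; the function name `radial_offset` is hypothetical, and an off-center Focus would need a more involved computation.

```python
import math


def radial_offset(point, center=(0.5, 0.5), radius_x=0.5, radius_y=0.5):
    """Gradient offset of point when Focus equals Center.

    Returns 0 at the center and 1 on the circumference of the ellipse.
    """
    nx = (point[0] - center[0]) / radius_x
    ny = (point[1] - center[1]) / radius_y
    return math.hypot(nx, ny)
```

With the default center and radii, the point (0.5, 0.5) has offset 0.0, the point (1.0, 0.5) on the circumference has offset 1.0, and (0.5, 0.75) lies halfway at 0.5.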

Alpha and Transparency

In accordance with the illustrated and described embodiment, each pixelof each element carries an alpha value ranging from 0.0 (completelytransparent) to 1.0 (fully opaque). The alpha value is used whenblending elements to achieve the visual effect of transparency.

Each element can have an Opacity attribute with which the alpha value ofeach pixel of the element will be multiplied uniformly.

Additionally, the OpacityMask allows the specification of per-pixel opacity, which controls how rendered content is blended into its destination. The opacity specified by OpacityMask is combined multiplicatively with any opacity already present in the alpha channel of the contents. The per-pixel opacity specified by the OpacityMask is determined by looking at the alpha channel of each pixel in the mask; the color data is ignored.

The type of OpacityMask is Brush. This allows the specification of howthe Brush's content is mapped to the extent of the content in a varietyof different ways. Just as when used to fill geometry, the Brushesdefault to filling the entire content space, stretching or replicatingits content as appropriate. This means that an ImageBrush will stretchits ImageSource to completely cover the contents, a GradientBrush willextend from edge to edge.

The required computations for alpha blending are described in theearlier section “Opacity Attribute”.

The following example illustrates how an OpacityMask is used to create a “fade effect” on a Glyphs element. The OpacityMask in the example is a linear gradient that fades from opaque black to transparent black.

    //  /content/p1.xml
    <FixedPage PageHeight="1056" PageWidth="816">
      <Glyphs
        OriginX = "96"
        OriginY = "96"
        UnicodeString = "This is Page 1!"
        FontUri = "../Fonts/Times.TTF"
        FontRenderingEmSize = "16"
      >
        <Glyphs.OpacityMask>
          <LinearGradientBrush StartPoint="0,0" EndPoint="1,0">
            <LinearGradientBrush.GradientStops>
              <GradientStopCollection>
                <GradientStop Color="#FF000000" Offset="0"/>
                <GradientStop Color="#00000000" Offset="1"/>
              </GradientStopCollection>
            </LinearGradientBrush.GradientStops>
          </LinearGradientBrush>
        </Glyphs.OpacityMask>
      </Glyphs>
    </FixedPage>

Images in Reach Documents

On FixedPages, images fill enclosed regions. To place an image on aFixedPage, a region must first be specified on the page. The region isdefined by the geometry of a Path element.

The Fill property of the Path element specifies the fill contents forthe described region. Images are one type of fill, drawn into a regionby the ImageBrush. All brushes have default behavior that will fill anentire region by either stretching or repeating (tiling) the brushcontent as appropriate. In the case of ImageBrush, the content specifiedby the ImageSource property will be stretched to completely cover theregion.

The markup below demonstrates how to place an image onto a Canvas.

    <Canvas>
      <Path>
        <Path.Data>
          <GeometryCollection>
            ...
          </GeometryCollection>
        </Path.Data>
        <Path.Fill>
          <ImageBrush ImageSource="/images/dog.jpg" />
        </Path.Fill>
      </Path>
    </Canvas>

Since many images are rectangular, including a rectangular Path element in the Resource Dictionary may be useful in simplifying the markup. The Path can then be positioned using a RenderTransform attribute (see above).

    <Canvas>
      <Canvas.Resources>
        <PathGeometry def:Name="Rectangle">
          <PathFigure>
            <StartSegment Point="0,0"/>
            <PolyLineSegment Points="100,0 100,100 0,100"/>
            <CloseSegment/>
          </PathFigure>
        </PathGeometry>
      </Canvas.Resources>
      <Canvas>
        <Canvas.RenderTransform>
          <MatrixTransform Matrix="1,0,0,1,100,100"/>
        </Canvas.RenderTransform>
        <Path Data="{Rectangle}">
          <Path.Fill>
            <ImageBrush ImageSource="/images/dog.jpg" />
          </Path.Fill>
        </Path>
      </Canvas>
    </Canvas>

Color

Colors can be specified in illustrated and described markup using scRGBor sRGB notation. The scRGB specification is known as “IEC 61966-2-2scRGB” and can be obtained from www.iec.ch

The ARGB parameters are described in the table below.

| Name | Description |
|---|---|
| R | The red scRGB component of the current color |
| G | The green scRGB component of the current color |
| B | The blue scRGB component of the current color |
| A | The alpha scRGB component of the current color |
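The markup examples in this description write colors as hexadecimal strings, such as "#0000FF" for a fill and "#FF000000" for opaque black in an opacity mask. The sketch below splits such a string into its A, R, G and B components. The function name `parse_color` is hypothetical, and treating a six-digit value as fully opaque is an assumption consistent with how the examples use it; scRGB floating-point notation is not covered.

```python
def parse_color(text):
    """Split "#RRGGBB" or "#AARRGGBB" into integer (A, R, G, B) components."""
    digits = text[1:] if text.startswith("#") else text
    if len(digits) == 6:
        # Assumption: a color without an alpha component is fully opaque.
        digits = "FF" + digits
    if len(digits) != 8:
        raise ValueError("expected 6 or 8 hexadecimal digits: " + text)
    a, r, g, b = (int(digits[i:i + 2], 16) for i in range(0, 8, 2))
    return (a, r, g, b)
```

For example, "#0000FF" parses to (255, 0, 0, 255), opaque blue, while "#00000000" parses to (0, 0, 0, 0), fully transparent black.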

Color Mapping

Currently, consideration is being given to the tagging of coloredelements with metadata specifying color context. Such metadata couldcontain an ICC color profile, or other color definition data.

The <Glyphs> Element

Text is represented in Fixed Payloads using a Glyphs element. The element is designed to meet requirements for printing and reach documents.

Glyphs elements may have combinations of the following properties.

Property | Purpose | Markup representation (Glyphs element)
Origin | Origin of first glyph in run. The glyph is placed so that the leading edge of its advance vector and its baseline intersect this point. | Specified by OriginX and OriginY properties
FontRenderingEmSize | Font size in drawing surface units (default 96ths of an inch) | Measured in Length units.
FontHintingEmSize | Size to hint for in points. Fonts may include hinting to produce subtle differences at different sizes, such as thicker stems and more open bowls in smaller sizes, to produce results that look more like the same style than pure scaling can. This is not the same as hinting for device pixel resolution, which is handled automatically. To date (March 2003) no known fonts include size hinting. Default value - 12 pts. | Measured in doubles representing points size of the font
GlyphIndices | Array of 16 bit glyph numbers that represent this run. | Part of Indices property. See below for representation.
AdvanceWidths | Array of advance widths, one for each glyph in GlyphIndices. The nominal origin of the nth glyph in the run (n > 0) is the nominal origin of the (n - 1)th glyph plus the (n - 1)th advance width added along the run's advance vector. Base glyphs generally have a non-zero advance width, combining glyphs generally have a zero advance width. | Part of Indices property. See below for representation.
GlyphOffsets | Array of glyph offsets. Added to the nominal glyph origin calculated above to generate the final origin for the glyph. Base glyphs generally have a glyph offset of (0, 0), combining glyphs generally have an offset that places them correctly on top of the nearest preceding base glyph. | Part of Indices property. See below for representation.
GlyphTypeface | The physical font from which all glyphs in this run are drawn. | FontUri, FontFaceIndex and StyleSimulations properties
UnicodeString | Optional*. Array of characters represented by this glyph run. | yes
ClusterMap | One entry per character in UnicodeString. Each value gives the offset of the first glyph in GlyphIndices that represents the corresponding character in UnicodeString. Where multiple characters map to a single glyph, or where a single character maps to multiple glyphs, or where multiple characters map to multiple glyphs indivisibly, the character or character(s) and glyph or glyph(s) are called a cluster. All entries in the ClusterMap for a multi-character cluster map to the offset in the GlyphIndices array of the first glyph of the cluster. | Part of Indices property. See below for representation.
Sideways | The glyphs are laid out on their side. By default, glyphs are rendered as they would be in horizontal text, with the origin corresponding to the Western baseline origin. With the sideways flag set, the glyph is turned on its side, with the origin being the top center of the unturned glyph. | yes
BidiLevel | The Unicode algorithm bidi nesting level. Numerically even values imply left-to-right layout, numerically odd values imply right-to-left layout. Right-to-left layout places the run origin at the right side of the first glyph, with positive values in the advance vector placing subsequent glyphs to the left of the previous glyph. | yes
Brush | The foreground brush used to draw glyphs | Picked up from the Shape Fill property.
Language | Language of the run, usually comes from the xml:lang property of markup. | Specified by xml:lang property

*Note that for GlyphRuns generated from Win32 printer drivers, text that was originally printed by Win32 ExtTextOut(ETO_GLYPHINDEX) calls is passed to the driver with glyph indices and without Unicode codepoints. In this case, the generated Glyphs markup, and thus the constructed GlyphRun object, will omit the codepoints. With no codepoints, functionality such as cut and paste or search in a fixed format viewer is unavailable; however, text display remains possible.

Overview of Text Markup

Glyph Metrics

Each glyph defines metrics that specify how it aligns with other glyphs. Exemplary metrics in accordance with one embodiment are shown in FIG. 12.

Advance Widths and Combining Marks

In general, glyphs within a font are either base glyphs or combining marks that may be attached to base glyphs. Base glyphs usually have an advance width that is non-zero, and a 0,0 offset vector. Combining marks usually have a zero advance width. The offset vector may be used to adjust the position of a combining mark and so may have a non 0,0 value for combining marks.

Each glyph in the glyph run has three values controlling its position. The values indicate origin, advance width, and glyph offset, each of which is described below:

-   Origin: Each glyph is assumed to be given a nominal origin; for the first glyph in the run this is the origin of the run.
-   Advance Width: The advance width for each glyph provides the origin of the next glyph relative to this glyph's origin. The advance vector is always drawn in the direction of the run progression.
-   Glyph Offset (Base or Mark): The glyph offset vector adjusts this glyph's position relative to its nominal origin.
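The relationship among origin, advance width, and glyph offset can be sketched in a few lines of Python. This is an illustration only, assuming a horizontal left-to-right run in which the advance is applied along the x axis and offsets are simply added to the nominal origin. The `Glyph` class and the `glyph_origins` function are hypothetical names introduced for this example and are not part of the described format.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Glyph:
    """Hypothetical record holding the per-glyph positioning values."""
    advance: float                 # advance width along the run direction
    offset: Point = (0.0, 0.0)     # glyph offset from the nominal origin


def glyph_origins(run_origin: Point, glyphs: List[Glyph]) -> List[Tuple[Point, Point]]:
    """Return (nominal origin, final origin) for each glyph in the run.

    Assumption: left-to-right horizontal layout, so each advance moves
    the nominal origin in the +x direction only.
    """
    x, y = run_origin
    origins = []
    for glyph in glyphs:
        nominal = (x, y)
        # The offset adjusts the position relative to the nominal origin.
        final = (x + glyph.offset[0], y + glyph.offset[1])
        origins.append((nominal, final))
        # The advance width gives the nominal origin of the next glyph.
        x += glyph.advance
    return origins
```

For example, a base glyph with advance 10 followed by a combining mark with zero advance and an offset of (-4, -2) leaves the next base glyph's nominal origin unchanged by the mark, because the mark contributes no advance.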

Characters, Glyphs, and the Cluster Map

Cluster maps contain one entry per Unicode codepoint. The value in the entry is the offset of the first glyph in the GlyphIndices array that represents this codepoint. Alternatively, where the codepoint is part of a group of codepoints representing an indivisible character cluster, the value is the offset of the first glyph in the GlyphIndices array that represents that cluster.

Cluster Mappings

The cluster map can represent codepoint-to-glyph mappings that are one-to-one, many-to-one, one-to-many, or many-to-many. In a one-to-one mapping, each codepoint is represented by exactly one glyph; the cluster map entries in FIG. 13 are 0, 1, 2, . . . .

In a many-to-one mapping, two or more codepoints map to a single glyph. The entries for those codepoints specify the offset of that glyph in the glyph index buffer. In the example of FIG. 14, the ‘f’ and ‘i’ characters have been replaced by a ligature, as is common typesetting practice in many serif fonts.

With respect to one-to-many mappings, consider the following in connection with FIG. 15. ‘Sara Am’ contains a part that sits on top of the previous base character (the ring), and a part that sits to the right of the base character (the hook). When Thai text is micro-justified, the hook is spaced apart from the base character, while the ring remains on top of the base character; therefore many fonts encode the ring and the hook as separate glyphs. When one codepoint maps to two or more glyphs, the value in the ClusterMap for that codepoint references the first glyph in the GlyphIndices array that represents that codepoint.

With respect to many-to-many mappings, consider the following in connection with FIG. 16. In some fonts an indivisible group of codepoints for a character cluster maps to more than one glyph. For example, this is common in fonts supporting Indic scripts. When an indivisible group of codepoints maps to one or more glyphs, the value in the ClusterMap for each of the codepoints references the first glyph in the GlyphIndices array that represents that codepoint.

The following example shows the Unicode and glyph representations of the Tamil word

. The first two codepoints combine to generate three glyphs.
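The four kinds of cluster mapping described above can be illustrated with a short Python sketch. It assumes each cluster is given as a pair (number of codepoints, number of glyphs) in run order; the function name `build_cluster_map` is hypothetical and is introduced only for this example.

```python
from typing import List, Tuple


def build_cluster_map(clusters: List[Tuple[int, int]]) -> List[int]:
    """Build a cluster map from (codepoint_count, glyph_count) pairs.

    Every codepoint in a cluster receives the offset, within the glyph
    index array, of the first glyph of that cluster.
    """
    cluster_map: List[int] = []
    first_glyph = 0
    for codepoint_count, glyph_count in clusters:
        cluster_map.extend([first_glyph] * codepoint_count)
        first_glyph += glyph_count
    return cluster_map


# One-to-one: three codepoints, three glyphs.
print(build_cluster_map([(1, 1), (1, 1), (1, 1)]))   # [0, 1, 2]
# Many-to-one: two codepoints form one ligature glyph, then one more glyph.
print(build_cluster_map([(2, 1), (1, 1)]))           # [0, 0, 1]
# One-to-many: the middle codepoint is drawn with two glyphs.
print(build_cluster_map([(1, 1), (1, 2), (1, 1)]))   # [0, 1, 3]
# Many-to-many: two codepoints generate three glyphs, then one more glyph.
print(build_cluster_map([(2, 3), (1, 1)]))           # [0, 0, 3]
```

The last call mirrors the situation described for the Tamil example, where the first two codepoints combine to generate three glyphs.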

Specifying Clusters

Cluster specifications precede the glyph specification for the first glyph of a non-1:1 cluster (that is, a cluster whose mapping is more complex than one character to one glyph).

Each cluster specification has the following form:

(ClusterCodepointCount [:ClusterGlyphCount])

Cluster specification part | Type | Purpose | Default value
ClusterCodepointCount | positive integer | Number of 16 bit Unicode codepoints combining to form this cluster | 1
ClusterGlyphCount | positive integer | Number of 16 bit glyph indices combining to form this cluster | 1

<Glyphs> Markup

The Glyphs element specifies a font as a URI, a face index and a set of other attributes described above. For example:

<Glyphs
  FontUri = "file://c:/windows/fonts/times.ttf"
  FontFaceIndex = "0" <!-- Default 0 -->
  FontRenderingEmSize = "20" <!-- No default -->
  FontHintingEmSize = "12" <!-- Default 12 -->
  StyleSimulations = "BoldSimulation" <!-- Default None -->
  Sideways = "false" <!-- Default false -->
  BidiLevel = "0" <!-- Default 0 -->
  Unicode = " ... " <!-- Unicode rep -->
  Indices = " ... " <!-- See below -->
  remaining attributes ... />

Each glyph specification has the following form:

[GlyphIndex][,[Advance][,[uOffset][,[vOffset][,[Flags]]]]].

Each part of the glyph specification is optional:

Glyph specification part | Purpose | Default value
GlyphIndex | Index of glyph in the rendering physical font | As defined by the font's character map table for the corresponding Unicode codepoint in the inner text.
Advance | Placement for next glyph relative to origin of this glyph. Measured in direction of advance as defined by the Sideways and BidiLevel attributes. Measured in 100ths of the font em size. Advance must be calculated such that rounding errors do not accumulate. See note below on how to achieve this requirement. | As defined by the font's HMTX or VMTX font metric tables.
uOffset, vOffset | Offset relative to glyph origin to move this glyph. Usually used to attach marks to base characters. Measured in 100ths of the font em size. | 0, 0
Flags | Distinguishes base glyphs and combining marks | 0 (base glyph)
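To make the glyph specification form concrete, the following Python sketch parses an Indices value into per-glyph records. It is a hedged illustration rather than a definitive grammar: it assumes that glyph specifications are separated by semicolons (as the Indices values in this document's markup examples suggest), that an optional cluster specification in parentheses precedes the glyph specification, and that an omitted part means "use the default". The `GlyphSpec` class and the `parse_indices` function are hypothetical names introduced for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GlyphSpec:
    """One parsed glyph specification; None means 'use the default'."""
    cluster_codepoints: Optional[int] = None  # set only where a cluster spec is present
    cluster_glyphs: Optional[int] = None
    glyph_index: Optional[int] = None
    advance: Optional[float] = None
    u_offset: Optional[float] = None
    v_offset: Optional[float] = None
    flags: Optional[int] = None


# Optional "(codepoints[:glyphs])" prefix, then the rest of the entry.
_ENTRY = re.compile(r"^(?:\((\d+)(?::(\d+))?\))?(.*)$", re.DOTALL)


def _value(text: str, convert):
    """Convert a non-empty field; an empty field stays None (default)."""
    return convert(text) if text else None


def parse_indices(indices: str) -> List[GlyphSpec]:
    """Parse an Indices string (assumed semicolon-separated) into GlyphSpec records."""
    specs = []
    for entry in indices.split(";"):
        codepoints, glyphs, rest = _ENTRY.match(entry.strip()).groups()
        parts = [part.strip() for part in rest.split(",")]
        if len(parts) > 5:
            raise ValueError(f"too many fields in glyph specification: {entry!r}")
        parts += [""] * (5 - len(parts))
        spec = GlyphSpec(
            glyph_index=_value(parts[0], int),
            advance=_value(parts[1], float),
            u_offset=_value(parts[2], float),
            v_offset=_value(parts[3], float),
            flags=_value(parts[4], int),
        )
        if codepoints is not None:
            spec.cluster_codepoints = int(codepoints)
            # ClusterGlyphCount defaults to 1 when omitted.
            spec.cluster_glyphs = int(glyphs) if glyphs else 1
        specs.append(spec)
    return specs
```

Under these assumptions, an Indices value such as ";;;;;(2:1)191" yields five empty specifications followed by one specification that declares a two-codepoint, one-glyph cluster drawn with glyph index 191.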

With respect to calculating advance without rounding error accumulation, consider the following. Each advance value must be calculated as the exact unrounded origin of the subsequent glyph minus the sum of the calculated (i.e. rounded) advance widths of the preceding glyphs. In this way each glyph is positioned to within 0.5% of an em of its exact position.

<Canvas xmlns="http://schemas.microsoft.com/2005/xaml/">
<Glyphs
  FontUri = "file://c:/windows/fonts/times.ttf"
  FontFaceIndex = "0"
  FontRenderingEmSize = "20"
  FontHintingEmSize = "12"
  StyleSimulations = "ItalicSimulation"
  Sideways = "false"
  BidiLevel = "0"
  OriginX = "75"
  OriginY = "75"
  Fill = "#00FF00"
  UnicodeString = "inner text ..." />
<!-- ‘Hello Windows’ without kerning -->
<Glyphs
  OriginX = "200"
  OriginY = "50"
  UnicodeString = "Hello, Windows!"
  FontUri = "file://C:/Windows/Fonts/Times.TTF"
  Fill = "#00FF00"
  FontRenderingEmSize = "20" />
<!-- ‘Hello Windows’ with kerning -->
<Glyphs
  OriginX = "200"
  OriginY = "150"
  UnicodeString = "Hello, Windows!"
  Indices = ";;;;;;;,89"
  FontUri = "file://C:/Windows/Fonts/Times.TTF"
  Fill = "#00FF00"
  FontRenderingEmSize = "20" />
<!-- ‘Open file’ without ‘fi’ ligature -->
<Glyphs
  OriginX = "200"
  OriginY = "250"
  UnicodeString = "Open file"
  FontUri = "file://C:/Windows/Fonts/Times.TTF"
  Fill = "#00FF00"
  FontRenderingEmSize = "20" />
<!-- ‘Open file’ with ‘fi’ ligature -->
<Glyphs
  OriginX = "200"
  OriginY = "350"
  UnicodeString = "Open file"
  Indices = ";;;;;(2:1)191"
  FontUri = "file://C:/Windows/Fonts/Times.TTF"
  Fill = "#00FF00"
  FontRenderingEmSize = "20" />
<!-- ‘ B TyMaHe’ using pre-composed ‘e’ -->
<Glyphs
  OriginX = "200"
  OriginY = "450"
  xml:lang = "ru-RU"
  UnicodeString = " B TyMaHe"
  FontUri = "file://C:/Windows/Fonts/Times.TTF"
  Fill = "#00FF00"
  FontRenderingEmSize = "20" />
<!-- ‘ B TyMaHe’ using composition of ‘e’ and diaeresis -->
<Glyphs
  OriginX = "200"
  OriginY = "500"
  xml:lang = "ru-RU"
  UnicodeString = " B TyMaHe"
  Indices = "(1:2)72;142,0,-45"
  FontUri = "file://C:/Windows/Fonts/Times.TTF"
  Fill = "#00FF00"
  FontRenderingEmSize = "20" />
<!-- ‘ B TyMaHe’ Forced rendering right-to-left showing combining mark in logical order -->
<Glyphs
  OriginX = "200"
  OriginY = "550"
  BidiLevel = "1"
  xml:lang = "ru-RU"
  UnicodeString = " B TyMaHe"
  Indices = "(1:2)72;142,0,-45"
  FontUri = "file://C:/Windows/Fonts/Times.TTF"
  Fill = "#00FF00"
  FontRenderingEmSize = "20" />
</Canvas>
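The rule for calculating advances without accumulating rounding error, stated before the markup above, can be sketched in Python as follows. The sketch assumes the exact advance of each glyph is already known in 100ths of an em and that the run starts at position 0; the function name `rounded_advances` is hypothetical and introduced only for this illustration.

```python
from typing import List


def rounded_advances(exact_advances: List[float]) -> List[int]:
    """Round advances (in 100ths of an em) without accumulating error.

    Each emitted advance is the exact, unrounded origin of the subsequent
    glyph minus the sum of the rounded advances already emitted.
    """
    result: List[int] = []
    exact_origin = 0.0   # exact origin of the subsequent glyph
    emitted_sum = 0      # sum of the rounded advances of preceding glyphs
    for advance in exact_advances:
        exact_origin += advance
        rounded = round(exact_origin - emitted_sum)
        result.append(rounded)
        emitted_sum += rounded
    return result


exact = [33.4] * 5
print(rounded_advances(exact))       # [33, 34, 33, 34, 33]
print([round(a) for a in exact])     # [33, 33, 33, 33, 33]
```

Rounding each advance independently places the end of this run at 165 although the exact position is 167, whereas the cumulative scheme keeps every glyph origin within half a unit (0.5% of an em) of its exact position.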

Optimizing the Size of Glyphs Markup

Markup details, such as glyph indices and advance widths, can be omitted from the markup if a targeted client can regenerate them reliably. The following options allow dramatic optimization of commonly used simple scripts.

Optimizing Markup of Glyph Indices

Glyph indices may be omitted from markup where there is a one-to-one mapping between the positions of characters in the Unicode string and the positions of glyphs in the glyph string, and the glyph index is the value in the CMAP (character mapping) table of the font, and the Unicode character has unambiguous semantics.

Glyph indices should be provided in the markup where the mapping of characters to glyphs:

-   is not one-to-one, such as where two or more codepoints form a single glyph (ligature), or
-   has one codepoint generating multiple glyphs, or
-   involves any other form of glyph substitution, such as through application of an OpenType feature.

Glyph indices should be provided in markup where a rendering engine might substitute a different glyph than that in the CMAP (character mapping) table in the font. Glyph indices should be provided where the desired glyph representation is not that in the CMAP table of the font.

Optimizing Markup of Glyph Positions

Glyph advance width may be omitted from the markup where the advance width required is exactly that for the glyph in the HMTX (horizontal metrics) or VMTX (vertical metrics) tables of the font.

Glyph vertical offset may be omitted from the markup where it is zero. This is almost always true for base characters, and commonly true for combining marks in simpler scripts. However, this is often false for combining marks in more complex scripts such as Arabic and Indic.

Optimizing Markup of Glyph Flags

Glyph flags may be omitted for base glyphs with normal justification priority.
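The omission rules in this section can be combined into one small Python sketch that emits the shortest glyph specification for a single glyph. It is illustrative only. The function `glyph_spec` and its parameters are hypothetical; `cmap_index` stands for the glyph index the font's CMAP table would give for the character (pass None when the mapping is not one-to-one or the character is ambiguous, so that the index is always written), and `font_advance` stands for the advance from the font's HMTX or VMTX table in the same units as `advance`. The sketch also treats a zero horizontal offset as omittable, following the 0, 0 default given for uOffset and vOffset.

```python
from typing import Optional


def glyph_spec(glyph_index: int, advance: int,
               u_offset: int = 0, v_offset: int = 0, flags: int = 0,
               cmap_index: Optional[int] = None,
               font_advance: Optional[int] = None) -> str:
    """Return the shortest glyph specification that preserves the values.

    A part is left empty when a client could regenerate it: the index
    equals the CMAP value, the advance equals the font metric, the
    offsets are zero, or the flags are 0 (base glyph).
    """
    parts = [
        "" if glyph_index == cmap_index else str(glyph_index),
        "" if advance == font_advance else str(advance),
        "" if u_offset == 0 else str(u_offset),
        "" if v_offset == 0 else str(v_offset),
        "" if flags == 0 else str(flags),
    ]
    # Trailing empty parts (and their commas) can be dropped entirely.
    while parts and parts[-1] == "":
        parts.pop()
    return ",".join(parts)
```

With made-up font values, `glyph_spec(58, 89, cmap_index=58, font_advance=94)` returns ",89" (only the adjusted advance is written), and a glyph whose index, advance, offsets, and flags all match the defaults yields an empty specification.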

Conclusion

The above-described modular content framework and document format methods and systems provide a set of building blocks for composing, packaging, distributing, and rendering document-centered content. These building blocks define a platform-independent framework for document formats that enable software and hardware systems to generate, exchange, and display documents reliably and consistently. The illustrated and described reach package format provides a format for storing paginated or pre-paginated documents in a manner in which contents of a reach package can be displayed or printed with full fidelity among devices and applications in a wide range of environments and across a wide range of scenarios. Although the invention has been described in language specific to structural features and/or methodological steps, it is to be understood that the invention defined in the appended claims is not necessarily limited to the specific features or steps described. Rather, the specific features and steps are disclosed as preferred forms of implementing the claimed invention.

1. A memory for storing data for access by an application program being executed on a processor, comprising: a data structure stored in the memory, the data structure including a markup representation associated with a plurality of document parts, the markup representation including: a preferred content that is used by applications capable of processing the preferred content; and a fallback content that is used by applications incapable of processing the preferred content.

2. The memory of claim 1, wherein the markup representation further includes an element that controls a manner in which applications handle unknown attributes.

3. The memory of claim 1, wherein the markup representation further includes an element that identifies a namespace as ignorable.

4. The memory of claim 1, wherein the markup representation further includes an element that identifies elements and attributes associated with a namespace as ignorable.

5. The memory of claim 1, wherein the markup representation further includes an element that specifies behavior for ignorable content.

6. The memory of claim 1, wherein the markup representation further includes an element that reverses the effect of a declaring a namespace ignorable.

7. The memory of claim 1, wherein the preferred content and the fallback content can be nested in an arbitrary manner.

8. A programming interface embodied on one or more computer-readable media, comprising: a first group of services related to identifying attributes contained in a document; a second group of services related to determining a default handling behavior associated with each attribute in the document; a third group of services related to ignoring an attribute in the document if the attribute is not understood and the attribute's default handling behavior is set to ignore new features; and a fourth group of services related to halting processing of the document if the attribute is not understood and the attribute's default handling behavior is set to require understanding of attributes.

9. The programming interface of claim 8, further comprising a fifth group of services related to processing attributes in the document if the attribute is understood.

10. The programming interface of claim 8, wherein the first group of services includes a service for identifying elements contained in the document.

11. The programming interface of claim 8, wherein the default handling behavior may vary when processing different parts of the document.

12. The programming interface of claim 8, wherein the third group of services includes a service for maintaining the ignored attribute for future use if a carry along parameter is active.

13. The programming interface of claim 8, wherein the third group of services includes a service for discarding the ignored attribute if a carry along parameter is not active.

14. The programming interface of claim 8, wherein the third group of services includes a service for storing the attribute for use in a later process.

15. The programming interface of claim 8, wherein the first group of services includes a service for determining compatibility rules properties.

16. The programming interface of claim 8, further comprising a fifth group of services related to identifying behavior associated with ignorable content.

17. The programming interface of claim 8, wherein the first group of services includes a service for identifying XML attributes.

18. A software architecture comprising the programming interface as recited in claim 8.

19. A programming interface embodied on one or more computer-readable media, comprising: a first group of services related to defining a plurality of parts associated with a document; and a second group of services related to associating a markup representation with the plurality of parts, wherein the second group of services includes: a service for identifying a preferred content that is used by applications capable of processing the preferred content; and a service for identifying a fallback content that is used by applications incapable of processing the preferred content.

20. The programming interface of claim 19, wherein the first group of services includes a service for associating a name with each of the plurality of parts.

21. The programming interface of claim 19, wherein the second group of services includes a service for identifying an element that controls how applications react to unknown attributes.

22. The programming interface of claim 19, wherein the second group of services includes a service for identifying an element that declares a namespace as ignorable.

23. The programming interface of claim 19, wherein the second group of services includes a service for identifying an element that declares all elements and attributes associated with a namespace as ignorable.

24. The programming interface of claim 19, wherein the second group of services includes a service for identifying an element that specifies behavior for ignorable content.

25. The programming interface of claim 19, wherein the second group of services includes a service for identifying an element that reverses the effect of a namespace declared ignorable.

26. A software architecture comprising the programming interface as recited in claim 19.

27. A programming interface embodied on one or more computer-readable media, comprising: a first group of services related to identifying a document; a second group of services related to determining a handling behavior associated with the document, wherein the second group of services includes: a service for determining behavior associated with ignorable content in the document; a service for halting processing of the document if an element in the document is not understood and the handling behavior associated with the document requires an understanding of the element; a third group of services related to rendering the document.

28. The programming interface of claim 27, wherein the third group of services includes a service for displaying the document on a display device.

29. The programming interface of claim 27, wherein the third group of services includes a service for printing the document on a printing device.

30. The programming interface of claim 27, wherein the third group of services includes a service for transmitting the document to another device.

31. The programming interface of claim 27, wherein the second group of services includes a service for identifying XML elements.

32. An apparatus comprising: means for identifying attributes contained in a document; means for determining a default handling behavior associated with each attribute in the document; and means for processing the document, the means for processing the document configured to: ignore an attribute in the document if the attribute is not understood and the attribute's default handling behavior is set to ignore new features; and halt processing of the document if the attribute is not understood and the attribute's default handling behavior is set to require an understanding of attributes.

33. The apparatus of claim 32, wherein the means for processing the document is further configured to process attributes in the document that are understood.

34. The apparatus of claim 32, wherein the attributes include elements contained in the document.

35. The apparatus of claim 32, wherein the default handling behavior varies based on the portion of the document being processed.

36. The apparatus of claim 32, wherein the means for processing the document is further configured to: maintain the ignored attribute for future use if a carry along parameter is active; and discard the ignored attribute if a carry along parameter is not active.

37. The apparatus of claim 32, further comprising means for identifying behavior associated with ignorable content.

38. The apparatus of claim 32, further comprising means for rendering the document.

39. The apparatus of claim 32, further comprising means for communicating the document to another device for rendering.

40. An apparatus comprising: means for defining a plurality of parts associated with a document; means for identifying a preferred content used by applications capable of processing the preferred content; means for identifying a fallback content used by applications incapable of processing the preferred content; and means for defining a manner in which applications react to unknown attributes in the document.

41. The apparatus of claim 40, further including means for defining an element that declares a namespace as ignorable.

42. The apparatus of claim 40, further including means for defining an element that declares all elements and attributes associated with a namespace as ignorable.

43. The apparatus of claim 40, further including means for determining an element that specifies behavior for ignorable content.